Understanding Postgres Database Triggers
I’ve always known that my knowledge and usage of Postgres was elementary. Triggers and extensions were two topics that intrigued me, and I recently found a good excuse to spend the time to understand them. Here’s what I learned!
Introduction to Triggers
Here’s how triggers work:
- `TG_ARGV` contains an array of arguments passed to the trigger. Which arguments are passed is defined when the trigger is created by `CREATE TRIGGER`.
- Here’s an example function creation SQL: `CREATE OR REPLACE FUNCTION audit.if_modified_func() RETURNS TRIGGER AS $body$`
  - `audit` is the schema/namespace
  - `$body$` indicates the start of the method body; think of it as a heredoc string
- It seems as though you can define a function multiple times, with different args, like Elixir. Neat and confusing.
- To execute a function: `SELECT schema.function_name('param');`. Think of `SELECT` as an `eval` in this context.
- Variables are set using `DECLARE var_name text = 'VARIABLE_VALUE';`, although this is not valid in a normal SQL statement — only within a `FUNCTION`.
- `RAISE EXCEPTION 'text'` allows you to throw an exception within a function.
- `$` can be used like a `"` in any normal language. Kind of like a heredoc.
- Adding comments to columns, tables, and schemas is a powerful way to document your SQL, and even more important when creating new schemas in an extension.
- I have no idea how you debug these outside of caveman trial-and-error-style debugging. This seems like the most obvious downside of triggers: you want them to be as small and simple as possible to reduce debugging risk.
- What’s interesting is that the function logic uses a completely separate set of SQL from the SQL you use to query. There are if statements, variables, a `ROW()` function for insertions, etc. This feels really weird as a design decision.
  - Doing more research, it looks like there is a specific extension of standard Postgres SQL for stored procedures (functions): PL/pgSQL.
- When functions are defined, you’ll see `$$ LANGUAGE plpgsql;` at the end of the definition, indicating the specific language the function is written in.
- You can write functions in multiple languages; `plpgsql` is just one of them.
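To tie these pieces together, here’s a minimal sketch of a trigger. The `entity` table, its columns, and the function name are hypothetical — this is not the audit trigger discussed below, just the smallest working shape:

```sql
-- Hypothetical table for illustration
CREATE TABLE entity (id serial PRIMARY KEY, name text);

-- Comments document the schema in the database itself
COMMENT ON TABLE entity IS 'Example table for the trigger sketch';

-- A trivial trigger function; NEW is the row being inserted/updated
CREATE OR REPLACE FUNCTION check_entity_name() RETURNS TRIGGER AS $body$
BEGIN
    IF NEW.name IS NULL THEN
        RAISE EXCEPTION 'name cannot be null';
    END IF;
    RETURN NEW;
END;
$body$ LANGUAGE plpgsql;

-- Wire the function to the table; any args listed after the function
-- name would show up in TG_ARGV inside the function body
CREATE TRIGGER entity_name_check
    BEFORE INSERT OR UPDATE ON entity
    FOR EACH ROW EXECUTE FUNCTION check_entity_name();
```

Note that `EXECUTE FUNCTION` requires Postgres 11+; older versions use `EXECUTE PROCEDURE`.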
Some SQL Findings
Every time I do something semi-serious with SQL I come away with a bunch of learnings:
- `'{one, two, three}'::text[]` and `ARRAY['one', 'two', 'three']` are the same expression.
- Access an array element with `SELECT (ARRAY['version_col', 'changed_by', 'changed_timestamp'])[1];` — the parens are important.
- `SELECT last_value FROM entity_id_seq` will get you the auto-increment counter.
- `SELECT setval('entity_id_seq', (SELECT MAX(id) FROM entity))` can update the counter for that table if it’s not matching reality (after an import or data issue).
- A schema is basically a namespace within a database.
Installing & Using the Audit Trigger
My primary goal was to set up audit logging for a JSON payload on a particular table. I found an open source trigger and made some improvements to it. It’s a great example of a medium-complexity trigger: what they look like and how they work.
The trick was figuring out how to get it integrated and tested in my application. Here’s how I built the migration and test for the audit log:
```elixir
defmodule App.Repo.Migrations.AddAuditTrigger do
  use Ecto.Migration

  def up do
    abs_path = Path.expand("priv/repo/audit_trigger.sql")
    database_url = Application.fetch_env!(:app, App.Repo)[:url]

    # a story for a separate time, but executing multiple SQL statements in Ecto is impossible
    {:ok, result} = PsqlExecutor.execute_sql_file(database_url, abs_path)

    IO.puts(result)

    table = App.Entity.__schema__(:source) |> to_string()

    # create a JSON-like string of the columns we want to exclude;
    # we only want to watch the crawl_payload (and the id, which should never change)
    columns =
      (App.Entity.__schema__(:fields) -- [:crawl_payload, :id])
      |> Enum.map(&Atom.to_string/1)
      |> Enum.map(&"'#{&1}'")
      |> Enum.join(",")

    :ok =
      execute("""
      SELECT audit.audit_table(
        '#{table}',
        -- audit rows, not just tables
        'true',
        -- output sql which triggered the audit
        'false',
        -- list of *exclusion* columns
        ARRAY[#{columns}]);
      """)
  end

  def down do
    :ok = execute("DROP SCHEMA IF EXISTS audit CASCADE;")
  end
end
```
And here’s an example application-layer test case:
```elixir
test "triggers" do
  entity = fixture(:entity)
  entity_record = App.Entity.create_entity_from_payload(entity)

  App.Entity.changeset(entity_record, %{json_payload: %{"something" => "changed"}})
  |> App.Repo.update!()

  # raw SQL select on audit.logged_actions
  result = App.Repo.sql("SELECT * FROM audit.logged_actions")

  assert result.num_rows == 2
  [insert_audit, update_audit] = result.rows

  assert Enum.at(insert_audit, -1) == nil
  assert Enum.at(update_audit, -1) == entity_record.id

  # the previous payload lives at column -4, the changed payload at -3
  changed_payload_column = -3

  assert nil == Enum.at(insert_audit, changed_payload_column)

  assert %{"json_payload" => "{\"something\": \"changed\"}"} ==
           Enum.at(update_audit, changed_payload_column)
end
```
Merge Multiple Git Branches
One common pattern I run into when working with zombie open source projects (like the audit trigger) is that there are a bunch of PRs that should be merged, but aren’t. I want them merged into my fork, and I’d love to convince the original author to merge them as well.
One trick I’ve developed for myself is a custom `.git-custom-branch` file, along with a custom zsh script that merges all of the listed branches together into a single branch:
```
deaudit_table
patch-1
peterbecker/master
michelmilezzi/master
```
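The author’s actual zsh script isn’t shown, but the core of it can be sketched in a few portable lines — read each branch name from the file and emit (or run) a merge for it. The function name and the dry-run design are my own assumptions:

```shell
# Sketch: print the merge command for each branch listed in the file.
# Pipe the output to `sh` (or drop the printf in favor of running
# `git merge` directly) to actually perform the merges.
merge_plan() {
  while IFS= read -r branch; do
    [ -z "$branch" ] && continue   # skip blank lines
    printf 'git merge --no-edit %s\n' "$branch"
  done < "$1"
}
```

Emitting the commands first (rather than merging immediately) makes it easy to eyeball the plan before a conflict-prone multi-branch merge.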
This makes it easy to cherry-pick branches and mix them in with my own. You can easily pull PRs into your local repo using the excellent GitHub CLI: `gh pr checkout 32`.