Understanding Postgres Database Triggers

September 12, 2023

Tags: databases, postgres • Categories: Learning

Table of Contents

I’ve always known that my knowledge and usage of postgres was elementary. Triggers and extensions were two topics that intrigued me, and I recently found a good excuse to spend the time to understand them. Here’s what I learned!

Introduction to Triggers

Here’s how triggers work:

TG_ARGV contains an array of arguments passed to the trigger
Which arguments are passed to the trigger are defined when it is created by CREATE TRIGGER
Here’s an example function creation SQL CREATE OR REPLACE FUNCTION audit.if_modified_func() RETURNS TRIGGER AS $body$
- audit is the schema/namespace
- $body$ indicates the start of the method body, think of it as a heredoc string
It seems as though you can define a function multiple times, with different args, like elixir. Neat and confusing.
To execute functions SELECT schema.function_name('param');. Think of SELECT as a eval is SQL in this context.
Variables are set using DECLARE var_name text = 'VARIABLE_VALUE'; although is not valid in a normal SQL statement and only in a FUNCTION
RAISE EXCEPTION 'text' allows you to throw an exception within a function
$ can be used like a " in any normal language. Kind of like a heredoc.
Adding comments to columns, tables, and schema is a powerful way to document your SQL and even more important when creating new schema in an extension.
I have no idea how you debug these outside of caveman trial & error-style debugging. These seem like the most obvious downside of triggers: you want them to be as small and simple as possible to reduce debugging risk.
What’s interesting is it seems like the function logic uses a completely separate set of SQL than the SQL you use to query. There’s if statements, variables, a ROW() function for insertions, etc. This feels really weird as a design decision.
- Doing more research here looks like there is a specific extension of the standard postgres SQLfor extensions/stored procedures (functions).
- When functions are defined, you’ll see $$ LANGUAGE plpgsql; at the end of the definition indicating the specific language the extension is written in.
- You can write extensions in multiple languages, plpgsql is just one of them

Some SQL Findings

Everytime I do something semi-serious with SQL I come away with a bunch of learnings:

'{one, two, three}'::text[] and ARRAY['one', 'two', 'three'] are the same expression
Access the array by using SELECT (ARRAY['version_col', 'changed_by', 'changed_timestamp'])[1]; the parens are important.
SELECT last_value FROM entity_id_seq will get you the auto increment counter
SELECT setval('entity_id_seq', (SELECT MAX(id) FROM entity)) can update the counter for that table if it’s not matching reality (after an import or data issue)
A schema is bascially a namespace within a database.

Installing & Using the Audit Trigger

My primary goal was to setup audit logging for a json payload on a particular table. I found an open source trigger and made some improvements to it. It’s a great example of a medium-complexity trigger, what they look like and how they work.

The trick was how do I get it integrated + tested in my application? Here’s how I built the migration and test for the audit log:

defmodule App.Repo.Migrations.AddAuditTrigger do
  use Ecto.Migration

  def up do
    abs_path = Path.expand("priv/repo/audit_trigger.sql")
    database_url = Application.fetch_env!(:app, App.Repo)[:url]

    # for a separate time, but executing multiple SQL statements in Ecto is impossible
    {:ok, result} =
      PsqlExecutor.execute_sql_file(database_url, abs_path)

    IO.puts(result)

    table = App.Entity.__schema__(:source) |> to_string()

    # create a JSON-like string of the columns we want to exclude
    # we only want to watch the crawl_payload (and the id, which should never change)
    columns =
      (App.Entity.__schema__(:fields) -- [:crawl_payload, :id])
      |> Enum.map(&Atom.to_string/1)
      |> Enum.map(&"'#{&1}'")
      |> Enum.join(",")

    :ok =
      execute("""
      SELECT audit.audit_table(
        '#{table}',
        -- audit rows, not just tables
        'true',
        -- output sql which triggered the audit
        'false',
        -- list of *exclusion* columns
        ARRAY[#{columns}]);
      """)
  end

  def down do
    :ok = execute("DROP SCHEMA IF EXISTS audit CASCADE;")
  end
end

And here’s an example application-layer test case:

test "triggers" do
  entity = fixture(:entity)
  entity_record = App.Entity.create_entity_from_payload(entity)

  App.Entity.changeset(entity_record, %{json_payload: %{"something" => "changed"}})
  |> App.Repo.update!()

  # raw ecto execute select on audit.logged_actions
  result = App.Repo.sql("SELECT * FROM audit.logged_actions")
  result.rows

  assert result.num_rows == 2
  [insert_audit, update_audit] = result.rows

  assert Enum.at(insert_audit, -1) == nil
  assert Enum.at(update_audit, -1) == entity_record.id

  previous_payload_column = -4
  changed_payload_column = -3

  assert nil ==
            Enum.at(insert_audit, changed_payload_column)

  assert %{"json_payload" => "{\"something\": \"changed\"}"} ==
            Enum.at(update_audit, changed_payload_column)
end

Merge Multiple Git Branches

One common pattern I run into with working with zombie open source projects (like audit trigger) is there are a bunch of PRs that should be merged, that aren’t. But I want them merged into my fork; and I’d love to try to convince the original author to merge them as well.

One trick I’ve developed for myself is a custom .git-custom-branch file along with a custom zsh script to merge all the branches together into a custom branch:

deaudit_table
patch-1
peterbecker/master
michelmilezzi/master

This makes it easy to cherry-pick branches and mix them in with my own. You can easily pluck PRs to your local repo using the excellent GH cli gh pr checkout 32.

Introduction to Triggers

Some SQL Findings

Installing & Using the Audit Trigger

Merge Multiple Git Branches

Keep in Touch