Skip to content

Understanding Postgres Database Triggers

Tags: databases, postgres • Categories: Learning

Table of Contents

I’ve always known that my knowledge and usage of postgres was elementary. Triggers and extensions were two topics that intrigued me, and I recently found a good excuse to spend the time to understand them. Here’s what I learned!

Introduction to Triggers

Here’s how triggers work:

  • TG_ARGV contains an array of arguments passed to the trigger
  • Which arguments are passed to the trigger are defined when it is created by CREATE TRIGGER
  • Here’s an example function creation SQL CREATE OR REPLACE FUNCTION audit.if_modified_func() RETURNS TRIGGER AS $body$
    • audit is the schema/namespace
    • $body$ indicates the start of the method body, think of it as a heredoc string
  • It seems as though you can define a function multiple times, with different args, like elixir. Neat and confusing.
  • To execute functions SELECT schema.function_name('param');. Think of SELECT as a eval is SQL in this context.
  • Variables are set using DECLARE var_name text = 'VARIABLE_VALUE'; although is not valid in a normal SQL statement and only in a FUNCTION
  • RAISE EXCEPTION 'text' allows you to throw an exception within a function
  • $ can be used like a " in any normal language. Kind of like a heredoc.
  • Adding comments to columns, tables, and schema is a powerful way to document your SQL and even more important when creating new schema in an extension.
  • I have no idea how you debug these outside of caveman trial & error-style debugging. These seem like the most obvious downside of triggers: you want them to be as small and simple as possible to reduce debugging risk.
  • What’s interesting is it seems like the function logic uses a completely separate set of SQL than the SQL you use to query. There’s if statements, variables, a ROW() function for insertions, etc. This feels really weird as a design decision.
    • Doing more research here looks like there is a specific extension of the standard postgres SQLfor extensions/stored procedures (functions).
    • When functions are defined, you’ll see $$ LANGUAGE plpgsql; at the end of the definition indicating the specific language the extension is written in.
    • You can write extensions in multiple languages, plpgsql is just one of them

Some SQL Findings

Everytime I do something semi-serious with SQL I come away with a bunch of learnings:

  • '{one, two, three}'::text[] and ARRAY['one', 'two', 'three'] are the same expression
  • Access the array by using SELECT (ARRAY['version_col', 'changed_by', 'changed_timestamp'])[1]; the parens are important.
  • SELECT last_value FROM entity_id_seq will get you the auto increment counter
  • SELECT setval('entity_id_seq', (SELECT MAX(id) FROM entity)) can update the counter for that table if it’s not matching reality (after an import or data issue)
  • A schema is bascially a namespace within a database.

    Installing & Using the Audit Trigger

    My primary goal was to setup audit logging for a json payload on a particular table. I found an open source trigger and made some improvements to it. It’s a great example of a medium-complexity trigger, what they look like and how they work.

The trick was how do I get it integrated + tested in my application? Here’s how I built the migration and test for the audit log:

defmodule App.Repo.Migrations.AddAuditTrigger do
  use Ecto.Migration

  def up do
    abs_path = Path.expand("priv/repo/audit_trigger.sql")
    database_url = Application.fetch_env!(:app, App.Repo)[:url]

    # for a separate time, but executing multiple SQL statements in Ecto is impossible
    {:ok, result} =
      PsqlExecutor.execute_sql_file(database_url, abs_path)

    IO.puts(result)

    table = App.Entity.__schema__(:source) |> to_string()

    # create a JSON-like string of the columns we want to exclude
    # we only want to watch the crawl_payload (and the id, which should never change)
    columns =
      (App.Entity.__schema__(:fields) -- [:crawl_payload, :id])
      |> Enum.map(&Atom.to_string/1)
      |> Enum.map(&"'#{&1}'")
      |> Enum.join(",")

    :ok =
      execute("""
      SELECT audit.audit_table(
        '#{table}',
        -- audit rows, not just tables
        'true',
        -- output sql which triggered the audit
        'false',
        -- list of *exclusion* columns
        ARRAY[#{columns}]);
      """)
  end

  def down do
    :ok = execute("DROP SCHEMA IF EXISTS audit CASCADE;")
  end
end

And here’s an example application-layer test case:

test "triggers" do
  entity = fixture(:entity)
  entity_record = App.Entity.create_entity_from_payload(entity)

  App.Entity.changeset(entity_record, %{json_payload: %{"something" => "changed"}})
  |> App.Repo.update!()

  # raw ecto execute select on audit.logged_actions
  result = App.Repo.sql("SELECT * FROM audit.logged_actions")
  result.rows

  assert result.num_rows == 2
  [insert_audit, update_audit] = result.rows

  assert Enum.at(insert_audit, -1) == nil
  assert Enum.at(update_audit, -1) == entity_record.id

  previous_payload_column = -4
  changed_payload_column = -3

  assert nil ==
            Enum.at(insert_audit, changed_payload_column)

  assert %{"json_payload" => "{\"something\": \"changed\"}"} ==
            Enum.at(update_audit, changed_payload_column)
end

Merge Multiple Git Branches

One common pattern I run into with working with zombie open source projects (like audit trigger) is there are a bunch of PRs that should be merged, that aren’t. But I want them merged into my fork; and I’d love to try to convince the original author to merge them as well.

One trick I’ve developed for myself is a custom .git-custom-branch file along with a custom zsh script to merge all the branches together into a custom branch:

deaudit_table
patch-1
peterbecker/master
michelmilezzi/master

This makes it easy to cherry-pick branches and mix them in with my own. You can easily pluck PRs to your local repo using the excellent GH cli gh pr checkout 32.