Welcome to my website! I’m Jonathan Y. Chan (jyc, jonathanyc, 陳樂恩, or 은총), a 🐩 Yeti fan, 🇺🇸 American, and 🐻 Californian, living in 🌁 San Francisco: the most beautiful city in the greatest country in the world. My mom is from Korea and my dad was from Hong Kong. I am a Christian. I’ve worked on:

Blog

Backing up iCloud Photos using rsync

Here’s the copy-icloud-photos script I use to back up my photos stored on iCloud to my Synology NAS:

#!/bin/bash
set -euo pipefail

args=(
  --delete
  --human-readable
  --no-perms
  --partial
  --progress
  --times
  -v
)

src="/Users/jyc/Pictures/Photos Library.photoslibrary/originals/"
cd "$src"
find ./ -cmin +1440 -print0 |
  rsync --files-from=- --from0 \
    "${args[@]}" \
    "./" \
    nas.home:/var/services/homes/jyc/Photos/iCloudViaMac

I added find recently because it’s annoying to accidentally back up temporary photos, like screenshots, that only live in my iPhone’s Camera Roll for a minute or so before I delete them.

I have launchd run that script daily using a configuration plist at ~/Library/LaunchAgents/jyc.copy-icloud-photos.service:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Disabled</key>
  <false/>
  <key>Label</key>
  <string>copy-icloud-photos</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/local/bin/fdautil</string>
    <string>exec</string>
    <string>/Users/jyc/bin/copy-icloud-photos</string>
  </array>
  <key>StandardErrorPath</key>
  <string>/tmp/copy-icloud-photos.err</string>
  <key>StandardOutPath</key>
  <string>/tmp/copy-icloud-photos.out</string>
  <key>StartInterval</key>
  <integer>86400</integer>
</dict>
</plist>

I set it up via LaunchControl, which is a third-party shareware GUI for launchd that also provides the fdautil wrapper script that makes it possible for the copy-icloud-photos script to have full disk access. I think it’s possible to get this to work without LaunchControl but I haven’t tried.

Unfortunately a big caveat is that this will back up recently deleted photos until they are truly deleted by iCloud. Here’s some SQL that lists the filenames of non-deleted, non-hidden photos under ~/Pictures/Photos Library.photoslibrary/originals/ when run on the database at ../Photos.sqlite:

select substr(ZFILENAME, 1, 1) || '/' || ZFILENAME
from ZASSET
where ZTRASHEDSTATE = 0 and ZHIDDEN = 0;

… but even when I grant bash, copy-icloud-photos, and sqlite3 Full Disk Access in System Settings > Privacy & Security, I can’t get it to work. I thought I might just need to grant my script Photos access as well, but that doesn’t work. Maybe Apple really is trying to block all programmatic access except through PhotoKit.
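A minimal sketch of what that query is meant to return, run against an in-memory stand-in rather than the real Photos.sqlite. The three-column table, the sample filenames, and the reading of ZTRASHEDSTATE (0 = not in the trash, 1 = recently deleted) are assumptions for illustration; the real ZASSET table has many more columns.

```python
import sqlite3

# Hypothetical stand-in for Photos.sqlite: only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table ZASSET (ZFILENAME text, ZTRASHEDSTATE integer, ZHIDDEN integer)"
)
conn.executemany(
    "insert into ZASSET values (?, ?, ?)",
    [
        ("A1B2.heic", 0, 0),  # visible photo
        ("C3D4.heic", 1, 0),  # recently deleted (assumed meaning of 1)
        ("E5F6.heic", 0, 1),  # hidden photo
    ],
)

# originals/ stores each file under a directory named after the first
# character of its filename, hence the substr(...) prefix.
QUERY = """
select substr(ZFILENAME, 1, 1) || '/' || ZFILENAME
from ZASSET
where ZTRASHEDSTATE = 0 and ZHIDDEN = 0
"""

def kept_paths(connection):
    """Return relative paths of photos that are neither trashed nor hidden."""
    return [row[0] for row in connection.execute(QUERY)]

paths = kept_paths(conn)
print(paths)
```

If the query ever does run against the real database, its output could be fed to rsync’s --files-from in place of the find output.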

I am Culgi, who has been chosen by Inana for his attractiveness.
…
Because I am a powerful man who enjoys using his thighs, I, Culgi, the mighty king, superior to all, strengthened the roads, put in order the highways of the Land.
…
So that my name should be established for distant days and never fall into oblivion, so that my praise should be uttered throughout the Land, and my glory should be proclaimed in the foreign lands, I, the fast runner, summoned my strength and, to prove my speed, my heart prompted me to make a return journey from Nibru to brick-built Urim as if it were only the distance of a double-hour.
— A praise poem of Shulgi (Shulgi A)

Sumerians didn’t skip leg day or cardio.

Notes on "Efficient Natural Language Response Suggestion for Smart Reply" by Henderson et al.

Previously: One-Paragraph Reviews, Vol. I

I didn’t manage to stick to the one-paragraph format this time. I’m trying to write down:

1. everything that was novel and notable to me when reading the paper
2. as concisely as possible

… but (1) can be a lot of stuff because the things I’m reading about are generally things on which I’m not an expert! I’ll try moving stuff that isn’t related to the main point into footnotes to cheat. If the trend continues, though, I’ll have to think of how to make things more concise…

“Efficient Natural Language Response Suggestion for Smart Reply” is a paper by Matthew Henderson, Rami Al-Rfou, Brian Strope, Yun-hsuan Sung, Laszlo Lukacs, Ruiqi Guo, Sanjiv Kumar, Balint Miklos, and Ray Kurzweil (2017) on the algorithm behind Google’s pre-LLM¹ “Smart Reply” feature, which suggests short replies like “I think it’s fine” or “It needs some work.”

The authors train a model composed of two neural network “towers”, one for the input email and one for the reply: each takes a vector representing an email, encoded as the sum² of the n-gram embeddings of its words. The model learns to compute two vectors, $h_x$ for input emails and $h_y$ for response emails, such that the score $h_x \cdot h_y$ is high when an email $y$ is a likely reply to an email $x$; $P(y|x)$ is then approximated by a softmax over the scores of candidate replies.

There are a few post-processing steps:

adding $\alpha \log P_{\text{LM}}(y)$ to the score, where $\alpha$ is an arbitrary constant and $P_{\text{LM}}$ is computed by a language model, because the learned score $h_x \cdot h_y$ is biased towards “specific and long responses instead of short and generic ones;”
a “diversification” stage where responses are clustered, to “omit redundant suggestions… and ensure a negative suggestion is given if the other two are affirmative and vice-versa”; and
instead of computing full dot products when searching for appropriate response emails, computing smaller quantized³ dot products.
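The first post-processing step can be sketched as follows. This is an illustration, not the paper’s implementation: the two-dimensional vectors, the language-model probabilities, and the value of $\alpha$ are all made up, and rank_replies is a hypothetical helper.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def rank_replies(h_x, candidates, alpha):
    """Rank candidate replies by h_x . h_y + alpha * log P_LM(y), best first.

    candidates is a list of (text, h_y, p_lm) tuples.
    """
    scored = [
        (dot(h_x, h_y) + alpha * math.log(p_lm), text)
        for text, h_y, p_lm in candidates
    ]
    return [text for _, text in sorted(scored, reverse=True)]

# Made-up numbers: the long reply has the higher dot-product score, but the
# language model considers it far less likely than the short generic reply.
h_x = [1.0, 0.0]
candidates = [
    ("Sounds good", [0.8, 0.1], 0.2),
    ("I will review the attached quarterly figures tonight", [1.0, 0.3], 0.001),
]

without_bias = rank_replies(h_x, candidates, alpha=0.0)
with_bias = rank_replies(h_x, candidates, alpha=0.5)
print(without_bias[0])
print(with_bias[0])
```

With $\alpha = 0$ the long, specific reply wins on dot product alone; a positive $\alpha$ lets the language-model term pull the short, generic reply to the top.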

These days, you might use someone else’s text embedding model for $h_x$ and $h_y$, but you’d still need the post-processing steps; you would also need some transformation from input vectors to reply vectors so that $h_x \cdot h_y$ represents “$h_y$ is a reply to $h_x$” rather than just “$h_y$ is similar to $h_x$.” I wonder if LLMs might become cheap enough that $P_{\text{LM}}$ becomes all you need, similar to how spellchecking used to be an engineering feat but is now “3-5 lines of Python.”

¹ Seq2Seq, the direct ancestor of the current generation of GPT-style LLMs, already existed at the time, but the authors wanted something more efficient.

² Cf. the sinusoidal or learned positional encoding used in many current models. Sinusoidal positional encodings have a vaguely geometric interpretation: a word/token at a given position in a sentence is the token’s embedding vector with a translation applied, such that the distance between the translation applied to tokens at two positions is “symmetrical and decays nicely with time”.

³ They learn a “hierarchical quantization” for each vector such that $h_y \approx \text{VQ}(h_y) + \text{R}^T \text{PQ}(r_y)$, where $\text{VQ}$ is a vector quantization, $\text{R}$ is a rotation, $r_y = \text{R}(h_y - \text{VQ}(h_y))$ is the rotated residual, and $\text{PQ}$ is a product quantization (the Cartesian product of $\mathcal K$ independent vector quantizers). Vector quantization just means approximating a vector by the nearest entry in a learned set of vectors (the “codebook”); compression comes from storing only that entry’s index instead of the full vector. It feels vaguely reminiscent of $k$-means, which represents each input by the nearest of $k$ learned centroids.
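A toy sketch of the two-level idea, assuming vector quantization means nearest-codeword lookup and taking the rotation $\text{R}$ to be the identity to keep things short. The codebooks are hand-written rather than learned, and all function names are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def quantize(vec, codebook):
    """Return the index of the codebook entry nearest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))

def product_quantize(vec, codebooks):
    """Split vec into len(codebooks) equal chunks; quantize each independently."""
    size = len(vec) // len(codebooks)
    return [
        quantize(vec[i * size:(i + 1) * size], cb)
        for i, cb in enumerate(codebooks)
    ]

def reconstruct(vq_index, pq_indices, vq_codebook, pq_codebooks):
    """Approximate the original vector as coarse codeword + quantized residual."""
    coarse = vq_codebook[vq_index]
    residual = [x for cb, idx in zip(pq_codebooks, pq_indices) for x in cb[idx]]
    return [c + r for c, r in zip(coarse, residual)]

# Hand-written codebooks (assumptions, not learned from data).
vq_codebook = [[0, 0, 0, 0], [10, 10, 10, 10]]
pq_codebooks = [
    [[0, 0], [1, 1]],    # codebook for the first half of the residual
    [[0, 0], [-1, -1]],  # codebook for the second half
]

h_y = [11.0, 10.9, 9.1, 9.0]
vq_index = quantize(h_y, vq_codebook)
residual = [a - b for a, b in zip(h_y, vq_codebook[vq_index])]
pq_indices = product_quantize(residual, pq_codebooks)
approx = reconstruct(vq_index, pq_indices, vq_codebook, pq_codebooks)
print(vq_index, pq_indices, approx)
```

Only three small integers (one coarse index and two residual indices) are stored per vector, which is where the savings come from.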

Elixir map iteration order is very undefined

The iteration order for Elixir maps is not just “undefined” in the sense that there is some order at runtime which you don’t know. Different functions that take maps can also iterate over the map in different orders!

Lists have the iteration order you’d expect:

range = 1..32
Enum.map(range, fn a -> a end)
Enum.zip_with(range, range, fn a, b -> {a, b} end)
# [1, 2, 3, ...]
# [{1, 1}, {2, 2}, {3, 3}, ...]

… and so do maps with 32 or fewer entries:

range = 1..32
map = Enum.map(range, &{&1, true}) |> Enum.into(%{})
IO.inspect(Enum.map(map, fn {k, _v} -> k end))
IO.inspect(Enum.zip_with(range, map, fn _, {k, _v} -> k end))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
#   22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
#   22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]

… but add one entry to a map and the pattern breaks:

range = 1..33
# ...
# [4, 25, 8, 1, 23, 10, 7, 9, 11, 12, 28, 24, 13, 3, 18, 29, 26, 22, 19, 2, 33,
#   21, 32, 20, 17, 30, 14, 5, 6, 27, 16, 31, 15]
# [15, 31, 16, 27, 6, 5, 14, 30, 17, 20, 32, 21, 33, 2, 19, 22, 26, 29, 18, 3,
#   13, 24, 28, 12, 11, 9, 7, 10, 23, 1, 8, 25, 4]

Enum.zip_with happens to enumerate over the entries of a map in opposite order from Enum.map!

I think it’s especially funny that this behavior only manifests for maps with more than 32 elements! It reminds me of this plotline (no spoilers) from Cixin Liu’s mind-blowing Remembrance of Earth’s Past trilogy:

“These high-energy particle accelerators raised the amount of energy available for colliding particles by an order of magnitude, to a level never before achieved by the human race. Yet, with the new equipment, the same particles, the same energy levels, and the same experimental parameters would yield different results. Not only the results would vary if different accelerators were used, but even with the same accelerator, experiments performed at different times would give different results. Physicists panicked. …”

“What does this mean?” Wang asked. …

“It means that the laws of physics are not invariant across time and space.”

On a less dramatic note, it reminds me of the Borwein integrals discovered by David Borwein and Jonathan Borwein in 2001:

$$ \int_0^\infty \frac{\sin(x)}{x} dx = \frac{\pi}{2} $$
$$ \int_0^\infty \frac{\sin(x)}{x} \frac{\sin(x/3)}{x/3} dx = \frac{\pi}{2} $$
$$ \int_0^\infty \frac{\sin(x)}{x} \frac{\sin(x/3)}{x/3} \cdots \frac{\sin(x/13)}{x/13} dx = \frac{\pi}{2} $$
$$ \int_0^\infty \frac{\sin(x)}{x} \frac{\sin(x/3)}{x/3} \cdots \frac{\sin(x/15)}{x/15} dx = \frac{\pi}{2} - 2.31 \times 10^{-11} $$
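The usual explanation for where the pattern breaks is that the integral stays at exactly $\pi/2$ only while $\frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{n} < 1$. A small sketch that finds the first odd $n$ at which the sum reaches 1, using exact fractions (first_break is a hypothetical helper name):

```python
from fractions import Fraction

def first_break(max_denominator=99):
    """Return (n, total) for the first odd n where 1/3 + 1/5 + ... + 1/n >= 1."""
    total = Fraction(0)
    for n in range(3, max_denominator + 1, 2):
        total += Fraction(1, n)
        if total >= 1:
            return n, total
    return None

n, total = first_break()
print(n, float(total))
```

The sum through $\frac{1}{13}$ is still a little under 1, and adding $\frac{1}{15}$ tips it over, which matches the integral that first falls short of $\pi/2$.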

It’s interesting to think about the different kinds of behavior which you can’t know ahead-of-time. Suppose I roll some dice inside of a closed box.

Non-deterministic but fixed. When I open the box, I see the dice have some value which I couldn’t predict, but which is the same regardless of how I open the box.
Not fixed. After I’ve opened the box, every time I look at the dice, their values have changed.
Depending on how I open the box, the dice have different values.

Unicode codepoint ranges for emoji

I’d assumed that emoji were all organized into a contiguous Unicode codepoint range, but this is very much not the case! There are more than a thousand different ranges containing emoji. The Unicode consortium makes the complete list available as a file, emoji-data.txt.

Here are a few lines:

25FB..25FE    ; Emoji                # E0.6   [4] (◻️..◾)    white medium square..black medium-small square
2600..2601    ; Emoji                # E0.6   [2] (☀️..☁️)    sun..cloud
2602..2603    ; Emoji                # E0.7   [2] (☂️..☃️)    umbrella..snowman
2604          ; Emoji                # E1.0   [1] (☄️)       comet

I wanted to convert this list into the form U+25FB-25FE,U+2600-2601,... for use with the kitty terminal’s symbol_map configuration option.

I wrote some shell to convert it into that format:

curl https://www.unicode.org/Public/UCD/latest/ucd/emoji/emoji-data.txt \
  | grep -v '^#' \
  | sed -e 's/;.*//' -e 's/[[:space:]]//g' -e '/^$/d' -e 's/\.\./-/' -e 's/^/U+/' \
  | sort \
  | uniq \
  | tr '\n' ',' \
  | sed 's/,$//'
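For comparison, here is a sketch of the same transformation in Python, run on a few sample lines instead of the downloaded file. The function name to_symbol_map and the duplicated Extended_Pictographic line are made up for illustration; sorting here is by codepoint order of the strings, which may differ from a locale-aware sort.

```python
def to_symbol_map(lines):
    """Convert emoji-data.txt lines into a 'U+XXXX-YYYY,U+ZZZZ,...' string."""
    ranges = set()
    for line in lines:
        if line.startswith("#"):
            continue  # same as grep -v '^#'
        field = line.split(";", 1)[0].strip()  # drop everything after ';'
        if not field:
            continue  # skip blank lines
        ranges.add("U+" + field.replace("..", "-", 1))
    # The set removes duplicates (uniq); sorted + join mirrors sort and tr.
    return ",".join(sorted(ranges))

SAMPLE = [
    "# emoji-data.txt sample",
    "25FB..25FE    ; Emoji                # E0.6   [4] white medium square..black medium-small square",
    "2600..2601    ; Emoji                # E0.6   [2] sun..cloud",
    "2604          ; Emoji                # E1.0   [1] comet",
    "2600..2601    ; Extended_Pictographic# made-up duplicate range for the demo",
    "",
]

result = to_symbol_map(SAMPLE)
print(result)
```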

Some pretty big caveats:

emoji-data.txt also contains ASCII codepoints like # (0x23), * (0x2a), and 0-9 (0x30-0x39)! Depending on your use case you might want to remove these.
One rendered emoji can be composed from multiple codepoints. For example, the emoji 🏋️‍♂️ (man lifting weights) is composed from three main codepoints: person lifting weights + zero width joiner + male sign (the fully-qualified sequence also carries a variation selector, U+FE0F, after the first and last of these). These codepoints are all listed separately in emoji-data.txt.
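Both caveats are easy to check from Python. The sketch below spells out the man-lifting-weights sequence with explicit escapes (so it does not depend on how this page’s emoji was encoded) and shows one possible way to drop ASCII entries from a list of ranges; drop_ascii is a hypothetical helper, not part of the pipeline above.

```python
import unicodedata

def codepoints(text):
    """List the codepoints in text as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in text]

def drop_ascii(ranges):
    """Remove ranges such as 'U+0030-0039' that start below U+0080."""
    return [r for r in ranges if int(r[2:].split("-")[0], 16) >= 0x80]

# Fully-qualified "man lifting weights": base emoji, variation selector,
# zero width joiner, male sign, variation selector.
man_lifting_weights = "\U0001F3CB\uFE0F\u200D\u2642\uFE0F"
parts = codepoints(man_lifting_weights)
print(parts)
print([unicodedata.name(ch) for ch in man_lifting_weights])

filtered = drop_ascii(["U+0023", "U+002A", "U+0030-0039", "U+25FB-25FE"])
print(filtered)
```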

One Paragraph Reviews, Vol. I

Going to try and see if this format helps me get through the backlog of reviews I’ve been meaning to write. The schema I’ll try is: (1) why it’s interesting (2) the most interesting insight.

3D Gaussian Splatting for Real-Time Radiance Field Rendering by Kerbl, Kopanas, Leimkühler, and Drettakis (2023). The authors reconstruct 3D scenes from 2D images and render much faster than before (≥ 100fps) by representing them as clouds of blurry balls. No neural networks: backpropagation is used to position the blurry balls (really “anisotropic Gaussians”; anisotropic just means they are rotated/skewed); errors are propagated all the way back from image-space pixels to world-space Gaussians! Another interesting non-neural network use of backprop is Constrain by Prof. Andrew Myers at Cornell, which uses backprop for constraint-based 2D graphics.
John Calvin’s Anxiety by William J. Bouwsma (1984). Calvinism is mostly known to others¹ for the doctrine of predestination; essentially the idea that God alone chooses who to save: not even the saved get a choice! This sounds fatalistic, so it’s interesting that Bouwsma, who was a history professor at Berkeley, observes that: “Anxiety is a motif that beats through almost everything Calvin wrote.” Bouwsma thinks that Calvin’s anxiety about a “fragile [physical] world” is connected to Calvin’s conception of a “constantly active” God so powerful as to appear “arbitrary.”
Compiler and Runtime Support for Continuation Marks by Flatt and Dybvig (2020). Continuation marks are similar to dynamically-scoped variables, like UNIX shell environment variables. In languages like Scheme with first-class continuations, they can be used to efficiently implement features like exceptions, which in e.g. Java you could never implement outside of the compiler. The paper is mostly about the efficient implementation of continuation marks in Chez Scheme, but it has a good overview of continuations and continuation marks.
The Remains of the Day by Kazuo Ishiguro (1989). An English butler reminisces as he leaves on a road trip from Darlington Hall, the English aristocratic mansion in which he’s worked his whole life. You come to find that he is not being entirely honest with himself. But Ishiguro’s skill as a writer allows emotion to pour through the pages in spite of this. The best book I’ve read in 2024 so far: a true masterpiece² of show-don’t-tell.
Introduction to H.264 Advanced Video Coding by Chen, Kao, and Lin (2006). A nice introduction to H.264, the preeminent video codec today: it’s probably built into the hardware of the computer you’re using to read this! Frames are broken up into macroblocks; each macroblock is encoded as a prediction plus a residual. There are three prediction modes: P(redicted)-type macroblocks are based on a macroblock from a previous frame, B(idirectional)-type macroblocks are based on a weighted average of macroblocks from multiple frames, and I(ntracoded)-type macroblocks are based on neighboring macroblocks in the same frame. P and B-type macroblocks have motion vectors that describe the offset in the reference frame from which to take their prediction. For example, if a ball is rolling around, it could ideally be encoded as one I-type macroblock then a series of P-type macroblocks with motion vectors and small residuals. The residual is compressed using the discrete cosine transform, dropping higher-frequency signals (e.g. fine textures), similar to JPEG. It was interesting to learn that many more samples per macroblock are used for luminance/Y (256) than chroma (64 for blue/U, 64 for red/V).
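The prediction-plus-residual idea from the H.264 review can be sketched on a toy frame. This is a deliberately simplified illustration with 2×2 blocks on a 4×4 frame; real macroblocks are 16×16 luma samples, and the transform and entropy-coding steps are left out. The helper names are hypothetical.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def predict_block(reference, top, left, dy, dx, size):
    """P-style prediction: copy the block the motion vector (dy, dx) points to."""
    return get_block(reference, top + dy, left + dx, size)

def residual(current, prediction):
    """Per-sample difference that would be transformed and stored."""
    return [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(current, prediction)]

# A bright 2x2 "ball" moves one sample to the right between frames.
previous = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
current = [
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 0, 0],
]

block = get_block(current, top=1, left=2, size=2)
prediction = predict_block(previous, top=1, left=2, dy=0, dx=-1, size=2)
diff = residual(block, prediction)
print(diff)

# Samples per macroblock with 4:2:0 subsampling: one 16x16 luma block, two 8x8 chroma blocks.
luma_samples, chroma_samples = 16 * 16, 8 * 8
print(luma_samples, chroma_samples)
```

With the right motion vector the residual is all zeros, so almost nothing needs to be stored for that block.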

… and that’s it for now!

¹ The denomination of which I am a member, the Presbyterian Church (USA), considers itself to belong to the Reformed/Calvinist tradition.

² Mr. Ishiguro won the Nobel Prize in Literature in 2017, so unfortunately I can’t say I read him before he was cool.

Imagine you're ChatGPT...

You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture.

— OpenAI’s system prompt for ChatGPT

Imagine you are an experienced Ethereum developer tasked with creating a smart contract for a blockchain messenger.

— a ChatGPT prompt found on the web

Peter Watts predicted LLM prompts in Blindsight (2006) and Echopraxia (2014):

Imagine you are Siri Keeton:

You wake in an agony of resurrection, gasping after a record-shattering bout of sleep apnea spanning one hundred forty days.

“Something’s coming,” she said at last. “Maybe not Siri.”
“Why do you say that?”
“It just sounds wrong the way it talks there are these tics in the speech pattern it keeps saying Imagine you’re this and Imagine you’re that and it sounds so recursive sometimes it sounds like it’s trying to run some kind of model…”
Imagine you’re Siri Keeton, he remembered. And gleaned from a later excerpt of the same signal: Imagine you’re a machine.
“It’s a literary affectation. He’s trying to be poetic. Putting yourself in the character’s head, that kind of thing.”

Corporate Processing Service scam

Received this official-looking document in the mail by virtue of having my address associated with my failed startup. If you look at the fine print you’ll notice it’s not actually from the government. It’s from a scam company called “Corporate Processing Service” that is generously offering to file a form for you for \$243.

The state only charges you \$25 and has an online form. See “Misleading Statement of Information Solicitations” on the California Secretary of State’s website.