Posted on 2023-08-26 · 8 min read · emacs
The term “S-expression” is not super accurate, and should be substituted with something like “semantic unit” instead, as I’m also talking about things that aren’t necessarily S-expressions as Emacs knows them. I mainly chose the term for brevity, and because it’s hopefully more familiar (and thus less scary) to the reader.

I have to make a confession: I have an evil past, literally. Having switched to vanilla Emacs keybindings a while ago, one thing that I genuinely miss from that time is the pair of ci( and ca( motions, killing everything in or around the closest encompassing ()-environment. Luckily, the change-inner package provides exactly these commands for Emacs proper. Unluckily, there are some issues regarding whitespace handling; let’s try to fix that.
How it all started

After happily using change-inner for a few days, one of the first problems I ran into was the package’s flakiness with respect to whitespace. This is elucidated in, for example, this issue:

> When using change-inner with rust-mode, the following code (with | as the cursor):
>
>     let issue_list_url = Url::parse(|
>         "https://github.com/rust-lang/rust/issues?labels=E-easy&state=open"
>     ).unwrap();
>
> calling M-x change-inner ( gives:
>
>     let issue_list_url = Url::parse|.unwrap();
>
> whereas I would expect:
>
>     let issue_list_url = Url::parse(|).unwrap();
>
> It looks like it’s related to newlines. There’s a similar issue in JS:
>
>     // works here
>     var foo = bar(|"baz");
>
>     // error: Couldn't find expansion
>     var foo = bar(|
>       "baz");

Change-inner as a package builds upon another excellent one from the same author: expand-region, an “Emacs extension to increase selected region by semantic units.” Essentially, change-inner just expands the region until it hits something that it’s happy with. As such, the problem alluded to above is with the respective expand-region functions that are called; specifically, er/mark-inside-pairs, which is defined like so:
    (defun er/mark-inside-pairs ()
      "Mark inside pairs (as defined by the mode), not including the pairs."
      (interactive)
      (when (er--point-inside-pairs-p)
        (goto-char (nth 1 (syntax-ppss)))
        (set-mark (save-excursion
                    (forward-char 1)
                    (skip-chars-forward er--space-str) ; ← HERE
                    (point)))
        (forward-list)
        (backward-char)
        (skip-chars-backward er--space-str) ; ← HERE
        (exchange-point-and-mark)))

Notice the invocations of (skip-chars-forward er--space-str); if we start with

    var foo = bar(|
      "baz");

(as is common, I will use | to indicate the position of the point) and call M-x er/mark-inside-pairs RET, then the marked area will actually just be "baz", instead of everything inside of the parentheses. Mystery solved, right? Maybe, but having to redefine that function for this package alone feels wrong to me. This got me looking into the internals of change-inner (I wish I hadn’t) in order to see where the problem actually lies.
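To see the effect of the whitespace skipping in isolation, here is a minimal sketch that uses only built-in functions. The names my/inner-bounds and my/inner-text are hypothetical, and the literal " \t\n" stands in for er--space-str, whose exact value I am assuming here. It is not expand-region's code, just the same idea with a switch for the skipping.

```elisp
(defun my/inner-bounds (skip-whitespace)
  "Return (BEG . END) for the innards of the list around point.
When SKIP-WHITESPACE is non-nil, trim surrounding whitespace the way
the two marked lines in `er/mark-inside-pairs' do."
  (save-excursion
    (let ((open (nth 1 (syntax-ppss)))) ; start of the innermost list
      (unless open (error "Not inside a list"))
      (goto-char open)
      (let ((beg (save-excursion
                   (forward-char 1)
                   (when skip-whitespace
                     (skip-chars-forward " \t\n")) ; assumed er--space-str
                   (point)))
            (end (progn
                   (forward-list)
                   (backward-char)
                   (when skip-whitespace
                     (skip-chars-backward " \t\n"))
                   (point))))
        (cons beg end)))))

(defun my/inner-text (skip-whitespace)
  "Run `my/inner-bounds' on the JS example and return the covered text."
  (with-temp-buffer
    (insert "var foo = bar(\n  \"baz\");")
    (goto-char (point-min))
    (search-forward "(")                ; point is now at bar(|
    (let ((bounds (my/inner-bounds skip-whitespace)))
      (buffer-substring-no-properties (car bounds) (cdr bounds)))))

(my/inner-text t)    ; → "\"baz\""
(my/inner-text nil)  ; → "\n  \"baz\""
```

With the skipping enabled, the point at bar(| lies outside the computed region, which matches the behaviour described above.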
Inside change-inner

Taking a closer look at change-inner* (the internal function doing the actual work) reveals the following. After some initial bookkeeping, the area surrounding the point is expanded, looking for the innermost expression matching the parameters:
    (er--expand-region-1)
    (er--expand-region-1)                 ; sic!
    (while (and (not (= (point) (point-min)))
                (not (looking-at q-char)))
      (er--expand-region-1))

(q-char is the char that the user input, but quoted as a regular expression via regexp-quote.)

Crucially, the area is expanded twice completely unconditionally. This stops only once we’ve reached the bounds of either the buffer, or the expression we are interested in. The idea is to start in a situation like this

    '( "one" "t|wo" "three" "four" )

and expand until we encompass the whole list

    '( "one" "t|wo" "three" "four" )
    ;^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

(As you’ve probably already guessed, the ^’s are supposed to signal the marked region.)

noting that the previous step was

    '( "one" "t|wo" "three" "four" )
    ;  ^^^^^^^^^^^^^^^^^^^^^^^^^^^

which equates to the “inner” part of the list. Then, one can call er/contract-region, which relies on an expansion history, in order to only kill this inner part. Why expand twice unconditionally? Because in a situation like

    '|( "one" "two" "three" "four" )

the expansion would immediately encompass the whole list, as in

    '|( "one" "two" "three" "four" )
    ; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

and its innards wouldn’t be available to expand-region’s contraction history. The “trick” is to actually expand further than necessary: loop through the while above until one inevitably hits (point-min) and stops expanding. This triggers yet another bit of code that then recurses with prefilled arguments

    (if (not (looking-at q-char))
        (if search-forward-char
            (error "Couldn't find any expansion starting with %S" char)
          (goto-char starting-point)
          (setq mark-active nil)
          (change-inner* yank? char))
      ;; … else …
      )

(search-forward-char is the second argument of change-inner*; if the function was called with that, we have already recursed once, so stop. char is the character that the user actually input. starting-point is the position of the point before anything happened.)

During that additional run of the function, it searches for the correct delimiter via

    (search-forward char (point-at-eol))

and, due to the way that search-forward works by default, we end up with the point directly after the opening delimiter

    '(| "one" "two" "three" "four" )

This now expands correctly.
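The overall control flow (look outwards for a matching opener and, failing that, search forward on the line and retry exactly once) can be sketched with built-ins alone. This is only an illustration under my own assumptions: my/enclosing-opener, my/find-opener, and my/demo are hypothetical names, the sketch walks the open-paren positions from syntax-ppss instead of using expand-region, and it therefore only understands paren-like delimiters, not strings.

```elisp
(defun my/enclosing-opener (char)
  "Return the position of the innermost enclosing opener matching CHAR.
CHAR is a one-character string such as \"(\".  Return nil if none matches."
  (let ((q-char (regexp-quote char))    ; make CHAR safe for `looking-at'
        found)
    ;; (nth 9 (syntax-ppss)) lists open-paren positions, outermost first.
    (dolist (pos (reverse (nth 9 (syntax-ppss))) found)
      (unless found
        (save-excursion
          (goto-char pos)
          (when (looking-at q-char)
            (setq found pos)))))))

(defun my/find-opener (char &optional searched)
  "Like `my/enclosing-opener', but search forward once if nothing matches.
SEARCHED is non-nil on the retry, so that we recurse at most once."
  (or (my/enclosing-opener char)
      (if searched
          (error "Couldn't find any expansion starting with %S" char)
        ;; `search-forward' leaves point directly after the match, that
        ;; is, just inside the delimiter, so the retry can succeed.
        (search-forward char (line-end-position))
        (my/find-opener char t))))

(defun my/demo (text pos char)
  "Insert TEXT into a temporary buffer and look for CHAR from POS."
  (with-temp-buffer
    (insert text)
    (goto-char pos)
    (my/find-opener char)))

(my/demo "foo(1, bar[2], 3)" 6 "(")   ; → 4, found by looking outwards
(my/demo "foo(1, bar[2], 3)" 1 "[")   ; → 11, found after searching forward
```

The regexp-quote call matters for a delimiter like "[", which would otherwise be an invalid regular expression for looking-at.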
Actually, the searching also has a different, actual, use. When in a situation like

    '(1| 2 "this is a string")

one might want to change the string; indeed, M-x change-inner " correctly jumps to the string:

    '(1 2 "|")

This is one of the great features of Vim’s ci", and certainly something to preserve.
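For illustration, the jump-then-empty behaviour for strings can be imitated with built-in functions only. This is a rough sketch and not how change-inner does it: my/kill-inner-string and my/demo-change-string are hypothetical names, and the text is deleted rather than killed to keep the example free of side effects on the kill ring.

```elisp
(defun my/kill-inner-string ()
  "Delete the contents of the string around point, keeping the quotes.
Return non-nil if point was inside a string."
  (let ((ppss (syntax-ppss)))
    (when (nth 3 ppss)                  ; non-nil when inside a string
      (let ((beg (nth 8 ppss)))         ; position of the opening quote
        (goto-char beg)
        (forward-sexp)                  ; jump past the closing quote
        (delete-region (1+ beg) (1- (point)))
        (goto-char (1+ beg))
        t))))

(defun my/demo-change-string ()
  "Replay the example above and return the buffer text and point."
  (with-temp-buffer
    (insert "'(1 2 \"this is a string\")")
    (goto-char 4)                       ; '(1| 2 "this is a string")
    ;; Searching forward puts point just inside the opening quote.
    (search-forward "\"" (line-end-position))
    (my/kill-inner-string)
    (list (buffer-string) (point))))

(my/demo-change-string)  ; → ("'(1 2 \"\")" 8)
```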
Puni to the rescue

I certainly know what I think of this solution. Instead of trying to fix this web of expansions and contractions, how about we rewrite the function instead? I’ve been happily using puni for a while, and it seems pretty apt for the job. Briefly, puni is a structured editing package, like paredit or smartparens, but it works for a broader range of languages than the former, while comprising a much smaller code-base (and even fewer language-specific bits) than the latter.
(Puni achieves this by relying on Emacs’s built-in functions.)

While I still prefer paredit for lisps, puni has become my de facto standard for language-agnostic parenthesis handling. Luckily for us, puni already comes equipped with a puni-expand-region function, so one can swiftly rewrite the core of change-inner* using that instead of er--expand-region-1:

    ;; Try to find a region.
    (puni-expand-region)
    (when (> (point) (mark))              ; By default, puni jumps to the end of the sexp
      (exchange-point-and-mark))
    (while (and (not (= (point) (point-min)))
                (not (looking-at q-char)))
      (puni-expand-region))

Notice that the double expansion vanished! Instead, when a region was found, we can make use of puni-bounds-of-list-around-point to get the internals explicitly, and then calculate how big the delimiters were:

    (let* ((rb (region-beginning))
           (re (region-end))
           (insides (progn (goto-char (1+ rb))
                           (puni-bounds-of-list-around-point)))
           (olen (- (car insides) rb))    ; Length of opening delimiter
           (clen (- re (cdr insides))))   ; Length of closing delimiter
      (kill-region (+ rb olen) (- re clen)))

Trying this out with our trusty example of

    var foo = bar(|
      "baz");

we… are greeted with a type error.

    Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p nil)
      puni--smaller-interval((103 . 108) (nil . 108))
Puni to the rescue?

The puni--smaller-interval function does some comparisons with <=, and having nil in there will obviously result in a bad time for everyone. As it turns out, puni also has some problems handling whitespace, in that it doesn’t skip it. At some point in puni-expand-region, we call puni-bounds-of-sexp-at-point, which tries to find out whether we are at the start or end of an S-expression by going forwards and backwards a few times:

    (save-excursion
      (setq end-forward (puni-strict-forward-sexp)
            beg-forward (puni-strict-backward-sexp)))
    (save-excursion
      (setq beg-backward (puni-strict-backward-sexp)
            end-backward (puni-strict-forward-sexp)))

Now, when we are in a situation like (| "furble"), an invocation of puni-strict-forward-sexp will leave us at ( "furble"|), but executing puni-strict-backward-sexp after that will result in ( |"furble"), which is not where we started. As such, puni will (incorrectly) conclude that we were not at the start of the expression. One could try to cram some whitespace handling into this, but who says we don’t run into other issues then?
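The same round-trip asymmetry shows up with Emacs's built-in forward-sexp and backward-sexp, which I use here purely as stand-ins for the puni-strict-* functions; whether puni's versions behave identically in every case is an assumption on my part. The helper name my/round-trip-from is hypothetical.

```elisp
(defun my/round-trip-from (text pos)
  "Insert TEXT into a temporary buffer and go to POS.
Then move forward and backward over one sexp, and return where point
ends up."
  (with-temp-buffer
    (insert text)
    (goto-char pos)
    (forward-sexp)    ; skips the whitespace, then the string
    (backward-sexp)   ; returns to the start of the string only
    (point)))

;; (| "furble") : we start at 2, but come back to 3.
(my/round-trip-from "( \"furble\")" 2)  ; → 3

;; (|"furble") : without the whitespace the round trip is exact.
(my/round-trip-from "(\"furble\")" 2)   ; → 2
```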
(The real reason, of course, is that I just wanted my code to work right now, instead of having to wait for upstream to fix something. At some point this should definitely be fixed in puni, though.)

In fact, puni-expand-region is written in such a way that it tries out different expansion strategies until one succeeds, so why not just quiet the error?

    (advice-add 'puni-bounds-of-sexp-at-point :around
                (lambda (fun &rest args)
                  ;; Call the original function, turning any error into nil.
                  (ignore-errors (apply fun args))))

This… turns out to work!

    // before
    var foo = bar(|
      "baz");

    // after
    var foo = bar(|);
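In case the advice mechanism is unfamiliar, here is the same pattern on a toy function. Both my/flaky-bounds and my/quiet are hypothetical names made up for this sketch; the only real APIs involved are advice-add and ignore-errors.

```elisp
(defun my/flaky-bounds (pos)
  "Toy stand-in: return (POS . POS+1), signalling an error when POS is nil."
  (cons pos (1+ pos)))

(defun my/quiet (fun &rest args)
  "Call FUN with ARGS, returning nil instead of signalling an error."
  (ignore-errors (apply fun args)))

;; An :around advice receives the original function as its first argument.
(advice-add 'my/flaky-bounds :around #'my/quiet)

(my/flaky-bounds 3)    ; → (3 . 4)
(my/flaky-bounds nil)  ; → nil, rather than a wrong-type-argument error
```

Using a named function instead of a lambda has the small benefit that the advice can later be removed again with (advice-remove 'my/flaky-bounds #'my/quiet).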
The code

For anyone interested, here is the full code. It also includes a mode setting, which can be set to outer, in order to kill around the parentheses; e.g.,

    // before
    let issue_list_url = Url::parse(|"https://my-url.com").unwrap();

    // after
    let issue_list_url = Url::parse|.unwrap();

(For obvious reasons, I will not submit this upstream to change-inner, but it will instead live in my personal configuration.)

Using a recursive local function also incidentally fixes #9. Nice.
    (cl-defun slot/change-sexp (&key search-for mode)
      "Delete (the innards of) a sexp.
    Takes a char, like ( or \", and kills the first ancestor semantic
    unit starting with that char.  The unit must be recognisable to
    `puni'.

    SEARCH-FOR is the opening delimiter to search for: if this is nil,
    prompt for one.  MODE is whether to kill the whole region (`outer'),
    or just the innards of it (any other value, including nil)."
      (cl-labels ((expand (char &optional forward)
                    "Expand until we encompass the whole expression."
                    (let* ((char (or char
                                     (char-to-string
                                      (read-char
                                       (format "Kill %s:"
                                               (symbol-name (or mode 'inner)))))))
                           (q-char (regexp-quote char))
                           (starting-point (point)))
                      ;; Try to find a region.
                      (puni-expand-region)
                      (when (> (point) (mark))
                        (exchange-point-and-mark))
                      (while (and (not (= (point) (point-min)))
                                  (not (looking-at q-char)))
                        (puni-expand-region))
                      ;; If we haven't found one yet, initiate a forward search and
                      ;; try again, once.
                      (when (not (looking-at q-char))
                        (goto-char starting-point)
                        (deactivate-mark)
                        (if forward
                            (error "Couldn't find any expansion starting with %S" char)
                          (search-forward char (pos-eol 2))
                          (expand char 'forward))))))
        (expand search-for)
        ;; Now that we have a region, decide what to do with it.
        (let ((rb (region-beginning))
              (re (region-end)))
          (if (eq mode 'outer)
              (kill-region rb re)           ; Kill everything
            ;; If we want to delete inside the expression, fall back to `puni'.
            ;; This circumvents having to call `er--expand-region-1' and then
            ;; `er/contract-region' in some vaguely sensical order, and hoping
            ;; to recover the inner expansion from that.
            ;; Addresses ghub:magnars/change-inner.el#5
            (let* ((insides (progn (goto-char (1+ rb))
                                   (puni-bounds-of-list-around-point)))
                   (olen (- (car insides) rb))   ; Length of opening delimiter
                   (clen (- re (cdr insides))))  ; Length of closing delimiter
              (kill-region (+ rb olen) (- re clen)))))))

One can bind killing the innards to M-i, and killing everything to M-o, as change-inner suggests. Alternatively, and this is what I do, a second small helper function is swiftly written, such that only one keybinding is needed:

(M-o will never be something other than …)

    (defun slot/change-around (&optional arg)
      "Kill the innards of a sexp; with prefix ARG, kill the whole sexp."
      (interactive "P")
      (if arg
          (slot/change-sexp :mode 'outer)
        (slot/change-sexp)))

    (bind-key "M-i" #'slot/change-around)