Posted on 2023-08-26 · 8 min read · emacs
The term “S-expression” is not super accurate,
and should be substituted with something like “semantic unit” instead,
as I’m also talking about things that aren’t necessarily S-expressions as Emacs knows them.
I mainly chose the term for brevity, and because it’s hopefully more familiar—and thus less scary—to the reader.
I have to make a confession:
I have an evil past—literally.
Having switched to vanilla Emacs keybindings a while ago,
one thing that I genuinely miss from that time are the ci( and ca( motions,
killing everything in or around the closest encompassing ()-environment.
Luckily, the change-inner package provides exactly these commands for Emacs proper.
Unluckily, there are some issues regarding whitespace handling—let’s try to fix that.
How it all started
After happily using change-inner for a few days, one of the first problems I ran into was the package’s flakiness with respect to whitespace. This is elucidated in, for example, this issue:

> When using change-inner with rust-mode, the following code (with | as the cursor):
>
> let issue_list_url = Url::parse(|
>     "https://github.com/rust-lang/rust/issues?labels=E-easy&state=open"
> ).unwrap();
>
> calling M-x change-inner ( gives:
>
> let issue_list_url = Url::parse|.unwrap();
>
> whereas I would expect:
>
> let issue_list_url = Url::parse(|).unwrap();
>
> It looks like it’s related to newlines. There’s a similar issue in JS:
>
> // works here
> var foo = bar(|"baz");
>
> // error: Couldn't find expansion
> var foo = bar(|
>     "baz");

Change-inner as a package builds upon another excellent one from the same author: expand-region, an “Emacs extension to increase selected region by semantic units.” Essentially, change-inner just expands the region until it hits something that it’s happy with. As such, the problem alluded to above is with the respective expand-region functions that are called; specifically, er/mark-inside-pairs, which is defined like so:
(defun er/mark-inside-pairs ()
  "Mark inside pairs (as defined by the mode), not including the pairs."
  (interactive)
  (when (er--point-inside-pairs-p)
    (goto-char (nth 1 (syntax-ppss)))
    (set-mark (save-excursion
                (forward-char 1)
                (skip-chars-forward er--space-str) ; ← HERE
                (point)))
    (forward-list)
    (backward-char)
    (skip-chars-backward er--space-str) ; ← HERE
    (exchange-point-and-mark)))
The problem is the (skip-chars-forward er--space-str);
if we start with

var foo = bar(|
    "baz");

and call M-x er/mark-inside-pairs RET,
then the marked area will actually just be "baz",
instead of everything inside of the parentheses.
(As is common, I will use | to indicate the position of the point.)
Mystery solved, right?
Maybe, but having to redefine that function for this package alone
feels wrong to me.
This got me looking into the internals of change-inner,
in order to see where the problem actually lies.
(I wish I hadn’t.)
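For reference, the redefinition dismissed above could look roughly like the following. This is only a sketch: the name my/mark-inside-pairs-keep-whitespace is made up, it uses a plain syntax-ppss check instead of expand-region's internal er--point-inside-pairs-p, and it assumes single-character delimiters.

```elisp
;; Hypothetical variant of `er/mark-inside-pairs' (not part of
;; expand-region): same idea, minus the whitespace skipping.
(defun my/mark-inside-pairs-keep-whitespace ()
  "Mark everything inside the enclosing pair, whitespace included."
  (interactive)
  (let ((open (nth 1 (syntax-ppss))))   ; start of innermost list, or nil
    (when open
      (goto-char open)
      (forward-list)                    ; just after the closing delimiter
      (backward-char)                   ; just before it
      (set-mark (point))
      (goto-char (1+ open)))))          ; just after the opening delimiter
```

As in the original function, point ends up at the beginning of the marked area and the mark at its end.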
Inside change-inner
Taking a closer look at change-inner* (the internal function doing the actual work) reveals the following.
After some initial bookkeeping,
the area surrounding the point is expanded,
looking for the innermost expression matching the parameters
(q-char is the char that the user input, but quoted as a regular expression via regexp-quote):

(er--expand-region-1)
(er--expand-region-1) ; sic!
(while (and (not (= (point) (point-min)))
            (not (looking-at q-char)))
  (er--expand-region-1))
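The "walk outwards until the delimiter matches" idea can also be sketched without expand-region, using only the built-in backward-up-list. The function name is hypothetical, and this is a swapped-in technique for illustration rather than what change-inner does: it only understands delimiters that the buffer's syntax table treats as parentheses, so it would not find strings, for example.

```elisp
(defun my/enclosing-list-start (char)
  "Return the start of the innermost enclosing list opened by CHAR, or nil."
  (save-excursion
    (catch 'found
      (condition-case nil
          (while t
            (backward-up-list 1)        ; signals an error at top level
            (when (eq (char-after) char)
              (throw 'found (point))))
        (error nil)))))                 ; ran out of lists: no match
```

Like the loop above, it stops either on a match or when there is nothing left to expand into; unlike the loop above, it leaves point and mark alone.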
For example, starting from

'( "one" "t|wo" "three" "four" )

the region is expanded until it covers the whole list
(as you’ve probably already guessed, the ^’s are supposed to signal the marked region):

'( "one" "t|wo" "three" "four" )
;^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is then contracted again to

'( "one" "t|wo" "three" "four" )
;  ^^^^^^^^^^^^^^^^^^^^^^^^^^^

using er/contract-region,
which relies on an expansion history,
in order to only kill this inner part.
Why expand twice unconditionally?
Because in a situation like

'|( "one" "two" "three" "four" )

a single expansion would already mark the whole list, as in

'|( "one" "two" "three" "four" )
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

and its innards wouldn’t be available to expand-region’s contraction history.
The “trick” is to actually expand further than necessary,
looping through the while above until one inevitably hits (point-min)
and stops expanding.
This triggers yet another bit of code that then recurses with prefilled arguments:

(if (not (looking-at q-char))
    (if search-forward-char
        (error "Couldn't find any expansion starting with %S" char)
      (goto-char starting-point)
      (setq mark-active nil)
      (change-inner* yank? char))
  ;; … else …
  )

Here, search-forward-char is the second argument of change-inner*;
if the function was called with that,
we have already recursed once, so stop.
Further, char is the character that the user actually input,
and starting-point is the position of the point before anything happened.
In this second invocation, the function first runs

(search-forward char (point-at-eol))

and, given how search-forward works by default, we end up with the point directly after the opening delimiter:

'(| "one" "two" "three" "four" )
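To make the point movement concrete, here is a small self-contained check in a temporary buffer. The helper name is made up, and it uses line-end-position in place of point-at-eol, which newer Emacs versions mark as obsolete.

```elisp
(defun my/point-after-search (text needle)
  "Return point after searching for NEEDLE on the first line of TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (search-forward needle (line-end-position))
    (point)))

;; `(' sits at position 2, so point ends up at 3, right after it.
(my/point-after-search "'( \"one\" \"two\" )" "(")
```

Because the search is bounded by the end of the line, a delimiter that only occurs further down in the buffer makes search-forward signal an error instead of moving point.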
Actually,
the searching also has a different, actual, use.
When in a situation like

'(1| 2 "this is a string")

one might want to change the string; indeed,
M-x change-inner " correctly jumps to the string:

'(1 2 "|")

This is one of the great features of Vim’s ci",
and certainly something to preserve.

Puni to the rescue
I certainly know what I think of this solution. Instead of trying to fix this web of expansions and contractions, how about we rewrite the function instead? I’ve been happily using puni for a while, and it seems pretty apt for the job. Briefly, puni is a structured editing package, like paredit or smartparens, but it works for a broader range of languages than the former, while comprising a much smaller code-base, and even fewer language-specific bits, than the latter. (Puni achieves this by relying on Emacs’s built-in functions.)
While I still prefer paredit for lisps,
puni has become my de facto standard for language-agnostic parenthesis handling.
Luckily for us,
puni already comes equipped with a puni-expand-region function,
so one can swiftly rewrite the core of change-inner*
using that instead of er--expand-region-1:

;; Try to find a region.
(puni-expand-region)
(when (> (point) (mark)) ; By default, puni jumps to the end of the sexp
  (exchange-point-and-mark))
(while (and (not (= (point) (point-min)))
            (not (looking-at q-char)))
  (puni-expand-region))
To kill only the insides, one can use puni-bounds-of-list-around-point
to get the internals explicitly,
and then calculate how big the delimiters were:

(let* ((rb (region-beginning))
       (re (region-end))
       (insides (progn (goto-char (1+ rb))
                       (puni-bounds-of-list-around-point)))
       (olen (- (car insides) rb))  ; Length of opening delimiter
       (clen (- re (cdr insides)))) ; Length of closing delimiter
  (kill-region (+ rb olen) (- re clen)))
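To see what this arithmetic does, it helps to plug in numbers. The helper below is hypothetical and takes the bounds as arguments instead of calling puni, so INSIDES merely stands in for whatever puni-bounds-of-list-around-point returns (assumed here to be a cons of the form (BEG . END)).

```elisp
(defun my/inner-bounds (rb re insides)
  "Return the part of the region RB..RE that lies between the delimiters.
INSIDES is a cons (BEG . END) delimiting the contents."
  (let ((olen (- (car insides) rb))     ; length of opening delimiter
        (clen (- re (cdr insides))))    ; length of closing delimiter
    (cons (+ rb olen) (- re clen))))

;; For a region covering ("abc") at positions 1..8, the contents
;; span 2..7, so both delimiters have length 1.
(my/inner-bounds 1 8 '(2 . 7))
```

Since the offsets are added straight back onto the region bounds, the result coincides with INSIDES; the two lengths mostly serve to spell out the intent.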
However, trying the puni-based rewrite on the earlier example,

var foo = bar(|
    "baz");

results in an error:

Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p nil)
  puni--smaller-interval((103 . 108) (nil . 108))
Puni to the rescue?
The puni--smaller-interval function does some comparisons with <=,
and having nil in there will obviously result in a bad time for everyone.
As it turns out, puni also has some problems handling whitespace,
in that it doesn’t skip it.
At some point in puni-expand-region,
we call puni-bounds-of-sexp-at-point,
which tries to find out whether we are at the start or end of an S-expression
by going forwards and backwards a few times:

(save-excursion
  (setq end-forward (puni-strict-forward-sexp)
        beg-forward (puni-strict-backward-sexp)))
(save-excursion
  (setq beg-backward (puni-strict-backward-sexp)
        end-backward (puni-strict-forward-sexp)))
Starting from (| "furble"),
an invocation of puni-strict-forward-sexp will leave us at ( "furble"|),
but executing puni-strict-backward-sexp after that will result in ( |"furble"),
which is not where we started.
As such, puni will (incorrectly) conclude that we were not at the start of the expression.
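The same asymmetry can be reproduced with Emacs's built-in forward-sexp and backward-sexp. These are used as stand-ins here, on the assumption that puni's strict variants behave alike in this particular case; the helper name is made up.

```elisp
(defun my/sexp-round-trip (text start)
  "Return point after moving forward, then backward, over one sexp.
The movement happens in a scratch buffer holding TEXT, from START."
  (with-temp-buffer
    (insert text)
    (goto-char start)
    (forward-sexp)
    (backward-sexp)
    (point)))

;; Starting right after the opening parenthesis (position 2), we come
;; back to position 3, just before the string: the space got skipped.
(my/sexp-round-trip "( \"furble\")" 2)
```

Without the space, as in ("furble"), the round trip does return to its starting position, which is why the problem only shows up with leading whitespace.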
One could try to cram some whitespace handling into this,
but who says we don’t run into other issues then?
(The real reason, of course,
is that I just wanted my code to work right now,
instead of having to wait for upstream to fix something.
At some point this should definitely be fixed in puni, though.)
In fact, puni-expand-region is written in such a way
that it tries out different expansion strategies until one succeeds, so why not just quiet the error?
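(advice-add 'puni-bounds-of-sexp-at-point :around
            (lambda (fun)
              ;; FUN is the original function, held in a variable,
              ;; so it has to be called via `funcall'.
              (ignore-errors (funcall fun))))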
With this advice in place, the example from before works as expected:

// before
var foo = bar(|
    "baz");

// after
var foo = bar(|);
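One practical wrinkle with advising via an anonymous lambda is that the advice is awkward to remove again. A possible variation, sketched here on a made-up function so that it runs without puni, is to give the advice a name:

```elisp
;; Hypothetical stand-in for a function that may signal an error.
(defun my/fragile-bounds ()
  "Always signal an error, for demonstration purposes."
  (error "Could not determine bounds"))

(defun my/ignore-errors-advice (fun &rest args)
  "Call FUN with ARGS, returning nil instead of signalling an error."
  (ignore-errors (apply fun args)))

(advice-add 'my/fragile-bounds :around #'my/ignore-errors-advice)
(my/fragile-bounds)                     ; no error any more, just nil

;; Because the advice has a name, it can be taken out again:
;; (advice-remove 'my/fragile-bounds #'my/ignore-errors-advice)
```

Accepting &rest args also keeps the advice working should the advised function ever grow parameters.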
The code
For anyone interested, here is the full code. It also includes a mode setting,
which can be set to outer,
in order to kill around the parentheses; e.g.,
// before
let issue_list_url = Url::parse(|"https://my-url.com").unwrap();
// after
let issue_list_url = Url::parse|.unwrap();
Using a recursive local function also incidentally fixes #9.
Nice.
(cl-defun slot/change-sexp (&key search-for mode)
  "Delete (the innards of) a sexp.
Takes a char, like ( or \", and kills the first ancestor semantic
unit starting with that char. The unit must be recognisable to
`puni'.
SEARCH-FOR is the opening delimiter to search for: if this is
nil, prompt for one. MODE is whether to kill the whole
region (`outer'), or just the innards of it (any other value,
including nil)."
  (cl-labels
      ((expand (char &optional forward)
         "Expand until we encompass the whole expression."
         (let* ((char (or char
                          (char-to-string
                           (read-char (format "Kill %s:"
                                              (symbol-name
                                               (or mode 'inner)))))))
                (q-char (regexp-quote char))
                (starting-point (point)))
           ;; Try to find a region.
           (puni-expand-region)
           (when (> (point) (mark))
             (exchange-point-and-mark))
           (while (and (not (= (point) (point-min)))
                       (not (looking-at q-char)))
             (puni-expand-region))
           ;; If we haven't found one yet, initiate a forward search and
           ;; try again, once.
           (when (not (looking-at q-char))
             (goto-char starting-point)
             (deactivate-mark)
             (if forward
                 (error "Couldn't find any expansion starting with %S" char)
               (search-forward char (pos-eol 2))
               (expand char 'forward))))))
    (expand search-for)
    ;; Now that we have a region, decide what to do with it.
    (let ((rb (region-beginning))
          (re (region-end)))
      (if (eq mode 'outer)
          (kill-region rb re)           ; Kill everything
        ;; If we want to delete inside the expression, fall back to `puni'.
        ;; This circumvents having to call `er--expand-region-1' and then
        ;; `er/contract-region' in some vaguely sensical order, and hoping
        ;; to recover the inner expansion from that.
        ;; Addresses ghub:magnars/change-inner.el#5
        (let* ((insides (progn (goto-char (1+ rb))
                               (puni-bounds-of-list-around-point)))
               (olen (- (car insides) rb))  ; Length of opening delimiter
               (clen (- re (cdr insides)))) ; Length of closing delimiter
          (kill-region (+ rb olen) (- re clen)))))))
One option is to bind killing the innards to M-i,
and killing everything to M-o,
as change-inner suggests.
Alternatively, and this is what I do
(M-o will never be something other than other-window),
one can use a single command that switches to the outer mode when given a prefix argument:

(defun slot/change-around (&optional arg)
  (interactive "P")
  (if arg
      (slot/change-sexp :mode 'outer)
    (slot/change-sexp)))
(bind-key "M-i" #'slot/change-around)
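For completeness, the first option (separate keys for the two modes) might look like this. The command names are made up, global-set-key is used only to keep the sketch free of extra packages, and slot/change-sexp from above is assumed to be loaded by the time the commands are invoked.

```elisp
(defun my/change-inner ()
  "Kill the innards of a sexp, prompting for its opening delimiter."
  (interactive)
  (slot/change-sexp :mode 'inner))

(defun my/change-outer ()
  "Kill a whole sexp, prompting for its opening delimiter."
  (interactive)
  (slot/change-sexp :mode 'outer))

(global-set-key (kbd "M-i") #'my/change-inner)
(global-set-key (kbd "M-o") #'my/change-outer)
```

Passing 'inner explicitly is not required, since any value other than 'outer kills only the innards, but it keeps the two commands symmetric.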