Extract text, join lines, and enclose in quotes on a Mac

My work involves a lot of reading and note-taking, and I regularly extract key quotes from the papers I’m reading to store with the reference in my reference manager (most recently, Zotero). Academic papers are often multi-column PDFs, so a copied quotation arrives full of line breaks, and you can waste a lot of time removing them by hand and then enclosing the quote in “quotation marks”. For a long time I used TextWrangler to semi-automate this process, with a simple AppleScript that takes the content of the clipboard, pastes it into TextWrangler, replaces line breaks with spaces, encloses the text in quotation marks, then cuts it back to the clipboard:

tell application "TextWrangler"
	activate
	paste
	replace "\\n" using " " searching in text 1 of window 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:false, extend selection:false}
	replace "(.*)" using "\\“\\1\\”" searching in text 1 of window 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:false, extend selection:false}
	select text 1 of window 1
	cut selection
end tell

This did the job, but it required that I (a) have TextWrangler open, (b) copy the text with CMD + C, (c) switch to TextWrangler, (d) run the script (using a keyboard shortcut, of course!) and then (e) switch to my reference manager and paste the text block. I always thought there must be a faster and better way. It turns out there is, using just a few commands in the terminal!

The starting point was this excellent tutorial on using the command line to manipulate the clipboard. Following the principles set out by Dave Kerr, I built a string of commands which I got working in Terminal:

pbpaste | tr "\r" " " | tr "\n" " " | sed -e 's/^/“/' -e 's/$/”/' | pbcopy
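For anyone who wants to see what each stage contributes, here is a minimal sketch of the same idea wrapped in a function that reads standard input, so it can be tried without touching the clipboard. The function name join_and_quote is made up for illustration, and the extra whitespace clean-up (squeezing repeated spaces and trimming the ends) is an addition that is not part of the one-liner above.

```shell
# Hypothetical helper: join the lines on stdin and wrap the result in curly quotes.
join_and_quote() {
    # 1. tr maps carriage returns and line feeds to spaces.
    # 2. tr -s squeezes runs of spaces (e.g. from a CR+LF pair) into one.
    # 3. sed trims leading/trailing spaces, then adds the opening and closing quote.
    tr '\r\n' '  ' | tr -s ' ' | sed -e 's/^ *//' -e 's/ *$//' -e 's/^/“/' -e 's/$/”/'
}

# Intended use on a Mac (pbpaste and pbcopy are macOS-only):
#   pbpaste | join_and_quote | pbcopy
```

Trying it with printf 'first line\nsecond line\n' | join_and_quote is a quick way to check the result before wiring it up to the clipboard.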

That was great, but I still needed to copy the text from my PDF, then switch to Terminal and run the command, then switch to Zotero and paste it. Could I do it more efficiently? It turns out I could, with the help of this discussion on StackExchange: How to get the selected text into an AppleScript, without copying the text to the clipboard? For simplicity, I used the basic (slower) code to grab the selected text:

-- Copy selected text to clipboard:
tell application "System Events" to keystroke "c" using {command down}
delay 1 -- Without this, the clipboard may have stale data.

I then combined the two in an Automator workflow, mapped as a global service so I could call it from any application (Adobe Reader, Safari, Mail etc). But there was something wrong! For some reason, my nice curly quotes, which worked fine in Terminal, were being spat out of bash as ‚Äú rather than “…”. I spent ages trying to figure out what was causing it. I had a hunch it must be down to character encoding, but couldn’t find a solution anywhere. That is, until I stumbled across this forum page where someone was having a similar problem in Alfred. All credit to deanishe for the solution, which in the end was simply to tell Bash to use a UTF-8 locale. In my case, that meant adding the following line to the top of my bash script:

export LANG=en_GB.UTF-8
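The garbled output is consistent with a plain encoding mismatch: the opening curly quote is three bytes in UTF-8, and in the old Mac OS Roman encoding those same three bytes, read one character per byte, are ‚, Ä and ú. The sketch below only shows the bytes; hex_bytes is a hypothetical helper name, and the idea that the Automator shell falls back to Mac OS Roman when no locale is set is an assumption based on the symptom rather than something confirmed here.

```shell
# Hypothetical helper: print the bytes on stdin as one run of hex digits.
hex_bytes() {
    # od pads its output differently on different systems, so strip spaces and newlines.
    od -An -tx1 | tr -d ' \n'
}

# The opening curly quote is U+201C, which UTF-8 encodes as three bytes.
printf '%s\n' "$(printf '“' | hex_bytes)"   # prints e2809c
```

If iconv is installed, printf '“' | iconv -f MACINTOSH -t UTF-8 should reproduce the garbled characters directly.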

Hey presto: a single keyboard shortcut that merges the lines, adds nice curly quotes, and places the output on the clipboard for me to paste wherever it needs to go :-)

Here’s the complete Automator Service, with the code snippets below:

Automator Screenshot

Service receives no input in any application [this ensures the service is always available]

Run AppleScript

on run {input, parameters}
	set the clipboard to ""
	-- Copy selected text to clipboard:
	tell application "System Events" to keystroke "c" using {command down}
	delay 0.2 -- Without this, the clipboard may have stale data.
	return input
end run

Run Shell Script

export LANG=en_GB.UTF-8
pbpaste | tr "\r" " " | tr "\n" " " | sed -e 's/^/“/' -e 's/$/”/' | pbcopy
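One possible refinement, offered as a sketch rather than part of the workflow above: because the AppleScript step clears the clipboard first, running the service with nothing selected would presumably leave the shell script with empty input, and the pipeline would then copy a bare pair of quotes. The hypothetical quote_if_nonempty function below refuses to do that. It assumes that an empty selection really does produce an empty clipboard, which has not been checked here.

```shell
# Hypothetical helper: print the joined, quoted text, or fail if stdin is blank.
quote_if_nonempty() {
    local text
    # Join lines the same way as the original pipeline (CR and LF become spaces).
    text=$(tr '\r\n' '  ')
    case "$text" in
        # At least one non-whitespace character: wrap it in curly quotes.
        *[![:space:]]*) printf '“%s”' "$text" ;;
        # Nothing but whitespace: report failure and print nothing.
        *) return 1 ;;
    esac
}

# Intended use in the Run Shell Script action (macOS only), after the export LANG line:
#   if quoted=$(pbpaste | quote_if_nonempty); then printf '%s' "$quoted" | pbcopy; fi
```

With this in place, a failed copy leaves the clipboard empty instead of replacing it with “”.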