8

If I cut some HTML from a Pidgin instant messenger window, I can easily paste it verbatim into a new HTML email in Thunderbird. All formatting (fonts, colors, etc.) is preserved, so it appears that my Ubuntu 13.10 desktop clipboard must have the HTML source somewhere.

But I'd like to tweak the HTML source.

How can I actually get at the HTML source when it is in the clipboard? I'd like to just throw it into a text file, work on the markup in Vim or whatever, then use this HTML source in a webpage or feed it to Thunderbird's "Insert → HTML".

Hmm, maybe something like PasteImg (mentioned in Getting a graphic on the clipboard to the disk?), but using request_rich_text() instead of request_image()? I wouldn't mind using a small Python script the rare times I want to get HTML source from the clipboard.

What's in the clipboard may actually be "rich text".

The Python script from this answer outputs

Current clipboard offers formats: ('TIMESTAMP', 'TARGETS', 'MULTIPLE',
'SAVE_TARGETS', 'COMPOUND_TEXT', 'STRING', 'TEXT', 'UTF8_STRING', 'text/html',
'text/plain')
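A rough Python 3 sketch of how that list of formats could be checked from a script, assuming the xclip utility is installed (it prints the offered targets one per line when asked for the TARGETS target). The function names `parse_targets` and `clipboard_targets` are made up for illustration.

```python
import subprocess


def parse_targets(output):
    """Split xclip's TARGETS listing (one target per line) into a list."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def clipboard_targets():
    """Ask xclip which formats the clipboard owner currently offers."""
    result = subprocess.run(
        ["xclip", "-selection", "clipboard", "-o", "-t", "TARGETS"],
        check=True,
        stdout=subprocess.PIPE,
    )
    return parse_targets(result.stdout.decode("ascii", errors="replace"))
```

If `"text/html" in clipboard_targets()` is true, the HTML source should be retrievable from the clipboard.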

Turns out my Pidgin logs are in HTML, so that's one way to get at this HTML source, bypassing the clipboard entirely. I'm still interested in the answer to the original question (how to retrieve HTML from the clipboard).

Adam Monsen
  • 2,225
  • I don't quite understand. The clipboard already offers text/plain. Why don't you just copy and paste it? I think there is something I'm missing here. – Lucio Feb 28 '14 at 22:32
  • So you are copying the formatted HTML from pidgin - e.g. as bold instead of <strong>bold</strong>, but do you want to copy the HTML source so you can edit it, or do you want to edit after it was pasted into Thunderbird? – Wilf Feb 28 '14 at 22:41
  • @Wilf, Both. Editing HTML source after pasting in Thunderbird would be ok, but having access to the actual HTML source on the clipboard would be even better because I could use that in a webpage or whatever. – Adam Monsen Feb 28 '14 at 23:01

6 Answers

6

Found it! Here's how to get at the HTML source when there's some on your clipboard:

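#!/usr/bin/env python
# Python 2 with PyGTK (the GTK 2 bindings)
import gtk
contents = gtk.Clipboard().wait_for_contents('text/html')
# wait_for_contents() returns None when no text/html target is offered
print contents.data if contents else ''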

This helped.

This didn't work for me. My callback was never entered.

Pablo Bianchi
  • 15,657
Adam Monsen
  • 2,225
  • 1
    i like this method best – Ace Mar 03 '14 at 23:43
  • Great, this is exactly what I need for Emacs org-mode: https://emacs.stackexchange.com/questions/12121/org-mode-parsing-rich-html-directly-when-pasting – xji Dec 21 '15 at 12:45
  • 2
    Doesn't seem to work on Python3/GTK3 though. xclip -selection clipboard -o -t text/html worked. – xji Jun 23 '18 at 10:22
1

I see what you're trying to get at. Try pasting into something that takes wysiwyg, edit it there, then copy paste into Thunderbird?

Maybe bluegriffon or libreoffice writer will work.

Ace
  • 316
  • If pasting it in does not allow it, you need to get to the HTML source first... you could search the folders in .thunderbird for a draft email and edit it. – Wilf Mar 01 '14 at 00:16
  • Good ideas, thanks! See my recent answer--I figured out how to do it with PyGTK. – Adam Monsen Mar 03 '14 at 07:13
1

Here's a modification of your script, which actually allows editing the html directly.

It also handles problems with character encoding: if you reply to an e-mail from someone using Windows, chances are the encoding is stuck in UTF-16, which is no good for editing. You may need to install the chardet module in Python.

Replace 'vi' with your text editor of choice in the subprocess.call(...) line.

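#!/usr/bin/env python
# Python 2 with PyGTK and the chardet module
import io
import os
import subprocess

import chardet
import gtk

dtype = 'text/html'
htmlclip = gtk.Clipboard().wait_for_contents(dtype).data
# Fall back to UTF-8 if chardet cannot guess the encoding
encoding = chardet.detect(htmlclip)['encoding'] or 'utf-8'

# Shove the clipboard to a temporary file, stored as UTF-8 for editing
tmpfn = '/tmp/htmlclip_%i' % os.getpid()

# io.open encodes on write; a plain open() would fail on non-ASCII text
with io.open(tmpfn, 'w', encoding='utf-8') as editfile:
    editfile.write(htmlclip.decode(encoding))

# Manually edit the temporary file
subprocess.call(['vi', tmpfn])

# Read the edited text back and restore the original encoding
with io.open(tmpfn, 'r', encoding='utf-8') as editfile:
    htmlclip = editfile.read().encode(encoding)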

# Put the modified data back to clipboard
gtk.Clipboard().set_with_data(
      [(dtype,0,0)],
      lambda cb, sd, info, data: sd.set(dtype, 8, htmlclip),
      lambda cb, d: None )
gtk.Clipboard().set_can_store([(dtype,0,0)])
gtk.Clipboard().store()

This does a full edit cycle, modifying the clipboard “in place”.

I use it to make up for Thunderbird's annoying lack of an HTML source editor:

  1. select all with ctrl+a in the message composing window
  2. ctrl+c
  3. run the above script, which opens an editor –
    1. make your changes to the html source
    2. save and quit
  4. ctrl+v in the compose window overwrites the entire content with your html-edited version.
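If installing chardet is not an option, a byte order mark check may be enough for the UTF-16 case described in this answer. The following is only a sketch in Python 3, not part of the script above; the helper name `decode_clipboard_html` and the UTF-8 fallback are assumptions.

```python
import codecs


def decode_clipboard_html(raw, fallback="utf-8"):
    """Decode clipboard bytes, honouring a UTF-8 or UTF-16 byte order mark."""
    if raw.startswith(codecs.BOM_UTF8):
        # Drop the three BOM bytes, then decode the rest as UTF-8
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec uses the BOM to pick the byte order and strips it
        return raw.decode("utf-16")
    # No BOM: assume the fallback encoding and replace undecodable bytes
    return raw.decode(fallback, errors="replace")
```

Data without a byte order mark is simply treated as UTF-8 here, which is a cruder guess than chardet makes.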
  • Maybe this works in thunderbird but I have problems to even re-paste copied unedited text (in my case in confluence) – yatsek Apr 09 '19 at 08:36
0

Another answer uses the xclip utility directly:

xclip -selection clipboard -o -t text/html

I find it the easiest and most reliable now, since GTK3/Python3 seems to have introduced some changes that broke the original answer.
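For anyone who still wants this from Python 3, one option is to wrap the same xclip command with subprocess instead of using GTK. This is a minimal sketch that assumes xclip is installed; the function names are hypothetical.

```python
import subprocess


def xclip_command(target="text/html", selection="clipboard"):
    """Build the xclip argument list used in this answer."""
    return ["xclip", "-selection", selection, "-o", "-t", target]


def read_clipboard(target="text/html"):
    """Return the raw clipboard bytes for the given target.

    Raises subprocess.CalledProcessError if xclip exits with a
    non-zero status.
    """
    result = subprocess.run(
        xclip_command(target), check=True, stdout=subprocess.PIPE
    )
    return result.stdout
```

The result is bytes rather than text, because the encoding of the text/html data depends on the application that put it on the clipboard.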

Pablo Bianchi
  • 15,657
xji
  • 642
0

To get the HTML source code of the "rich text":

# Write the HTML source to stdout
xclip -selection clipboard -t text/html -o

You have several TARGETs, like text/plain (default) to get just the rendered text.

I manipulate the text/plain content of the clipboard using xsel:

xsel -b | something | something_else | xsel -b

This is pretty useful, e.g., to turn a list into a "top n" count with sort | uniq -c | sort -rn | head -20.

In your case, replace the first xsel -b with the above xclip command.
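If the filter step grows beyond a shell one-liner, the same "top n" count can be approximated in Python. This sketch uses collections.Counter in place of sort | uniq -c | sort -rn | head; the function name `top_n` is made up, and the order of lines with equal counts may differ from the shell version.

```python
from collections import Counter


def top_n(text, n=20):
    """Count identical lines and return the n most frequent, uniq -c style."""
    counts = Counter(text.splitlines())
    return "\n".join(
        "{:7d} {}".format(count, line)
        for line, count in counts.most_common(n)
    )
```

A small script that prints `top_n(sys.stdin.read())` could then sit between the two xsel -b calls.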

Pablo Bianchi
  • 15,657
0

I don't really use Thunderbird, but you should be able to use an extension to help edit the source in Thunderbird, e.g.:

https://addons.mozilla.org/en-us/thunderbird/addon/edit-html-source/ (has not been updated for ages)
https://addons.mozilla.org/en-us/thunderbird/addon/stationery/

Wilf
  • 30,194
  • 17
  • 108
  • 164
  • 2
    "Edit HTML Source" wasn't compatible with my version of Thunderbird. I did try Stationery, but I couldn't figure out how to view/edit HTML source. – Adam Monsen Mar 01 '14 at 01:12