Opened 13 years ago

Closed 11 years ago

Last modified 11 years ago

#8141 closed defect (fixed)

&# construction in URL mistakenly interpreted as HTML entity

Reported by: bakert
Owned by: earthmkii
Milestone: Adium 1.4
Component: Adium Core
Version:
Severity: normal
Keywords:
Cc:
Patch Status: Accepted


If a query string in a URL pasted into Adium ends in & and is followed by an anchor that is an integer, the link will not work.

Steps to reproduce:


  • &#49224 is interpreted as HTML entity 쁈 - one of the "Hangul Syllables" (you can see it by mousing over the link or clicking it) - this stops the link from working.


This is a valid URL according to RFC 2396.
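
The behaviour is easy to reproduce outside Adium. The sketch below uses Python's standard `html.unescape`, which follows HTML5's lenient parsing rules and accepts a numeric character reference without its closing semicolon. It only stands in for Adium's decoder as an illustration, and the URL is a made-up example.

```python
import html

# Hypothetical URL: the query string ends in "&" and the anchor is an integer.
url = "http://example.com/page?a=1&#49224"

# A lenient decoder treats "&#49224" as a numeric character reference
# even though there is no terminating ";".
decoded = html.unescape(url)

# The "&#49224" tail is now a single Hangul syllable, so the link target changed.
print(decoded)
```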

Adium Version: 1.1.3 (Sunday Sept 30th 2007) OS X Version: 10.4.10

Attachments (1)

patch.diff (1.5 KB) - added by Pavel Safronov 11 years ago.


Change History (18)

comment:1 Changed 13 years ago by Jordan

Milestone: Adium X 1.2.1

Hahaha, yep it changes the end of the link into a Hangul Syllable like you said. Tested with 1.2svn via Yahoo though I'm sure it's the same for all.

comment:2 Changed 13 years ago by phy1729

I can't replicate this on Adium 1.2, Mac OS X 10.4.11. Tried sending to myself by AIM and through GTalk (web sending; Adium receiving).

comment:3 Changed 13 years ago by phy1729

Oddly it does show as a Hangul Syllable in the chat viewer.

comment:4 Changed 13 years ago by Jordan

I can still reproduce when sending via Adium. I believe it's because it's not escaped before being rendered by WebKit in the message view.

comment:5 Changed 13 years ago by Evan Schoenberg

Milestone: changed from Adium X 1.2.1 to Good idea for "later"

This is pretty nontrivial to fix, I think. I welcome a patch to fix it from someone who understands HTML and encoding/decoding!

comment:6 Changed 13 years ago by phy1729

It's saved correctly in the log, with the amp escaped, so it's a problem in the rendering. The parser should check for the ending semicolon before it converts the entity. Also, it would seem that the parser is run twice, because the amp seems to be included in the entity when displayed. If you tell me where to look, I'd be happy to look at it.

comment:7 Changed 13 years ago by Evan Schoenberg

Great! I think that the most likely culprit is -[AIHTMLDecoder decodeHTML:withDefaultAttributes:], perhaps at line 1352.

comment:8 Changed 13 years ago by phy1729

Which language is the code written in? I think AIHTMLDecoder.m line 1339 captures &nbsp; along with &nbsp, which is a problem: all entities are to be encapsulated with an & and a ;. Also, amp should be unescaped last because it starts an entity; otherwise &amp;nbsp; will be recognized as &nbsp; and then later made a non-breaking space. If you tell me the language (it looks like C), I can try to learn it and be more helpful.
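
The ordering point can be shown with a small sketch. It assumes a decoder built from sequential string replacements; the `ENTITIES` table and both function names are hypothetical and are not taken from AIHTMLDecoder.m. If the escaped ampersand is replaced first, its output is fed back into the later replacements and an escaped entity gets decoded twice.

```python
# Hypothetical entity table; a real decoder would know many more names.
ENTITIES = {"&nbsp;": "\u00a0", "&lt;": "<", "&gt;": ">"}

def decode_amp_first(text: str) -> str:
    """Buggy order: "&amp;" is unescaped before the other entities."""
    text = text.replace("&amp;", "&")
    for entity, char in ENTITIES.items():
        text = text.replace(entity, char)
    return text

def decode_amp_last(text: str) -> str:
    """Safer order: "&amp;" is unescaped only after everything else."""
    for entity, char in ENTITIES.items():
        text = text.replace(entity, char)
    return text.replace("&amp;", "&")

escaped = "&amp;nbsp;"  # the literal text "&nbsp;", escaped once

print(repr(decode_amp_first(escaped)))  # a non-breaking space: decoded twice
print(repr(decode_amp_last(escaped)))   # the literal text "&nbsp;"
```

Unescaping the ampersand last avoids the double decode here, although a single left-to-right pass over the input is generally the more robust design.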

comment:9 Changed 13 years ago by Evan Schoenberg

That explanation sounds pretty reasonable to me. The language of Adium (and of most Mac OS X programs) is Objective-C, also called Cocoa (though really there is a subtly different meaning to 'Cocoa'). Objective-C is a superset of C.

ObjCFun is a pretty brief introduction to the syntax; it may be enough to get you started if you're familiar with C. TooDarkPark has a more in-depth article. Both those link's are from Chris's page about getting started with Cocoa.

comment:10 Changed 13 years ago by Evan Schoenberg

links, not link's.

comment:11 Changed 13 years ago by phy1729

The best thing I can think to do would be to change line 1339 to a regex conditional. I think [&[0x[0-9a-f]+|#[0-9]+|[A-Za-z0-9]+];]+? will work, but regex can be slow, so there is most likely a better solution than mine, possibly coding our own.
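
As written, the pattern uses square brackets for grouping, which a regex engine reads as character classes, and HTML writes hexadecimal references as `&#x…;` rather than with a `0x` prefix. A sketch of what the pattern appears to intend, using parentheses and requiring the closing semicolon, could look like the following. It is in Python for brevity; `ENTITY_RE`, `NAMED` and `decode_entities` are illustrative names, and the named-entity table is deliberately tiny.

```python
import re

# "&", then a hex reference, a decimal reference, or a name, then ";".
ENTITY_RE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")

# Illustrative subset of named entities.
NAMED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "nbsp": "\u00a0"}

def decode_entities(text: str) -> str:
    """Decode only semicolon-terminated entities, in a single pass."""
    def replace(match):
        body = match.group(1)
        if body.startswith("#"):
            is_hex = body[1] in "xX"
            code = int(body[2:], 16) if is_hex else int(body[1:])
            # Leave out-of-range code points untouched.
            return chr(code) if code <= 0x10FFFF else match.group(0)
        # Unknown names are left untouched as well.
        return NAMED.get(body, match.group(0))
    return ENTITY_RE.sub(replace, text)

print(decode_entities("a=1&#49224"))  # unchanged: no closing ";"
print(decode_entities("&amp;nbsp;"))  # "&nbsp;": replaced text is not rescanned
```

Because `re.sub` does not rescan the text it has just substituted, the ampersand needs no special ordering in this version.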

Thanks for the links, evands. Also, the Mac's Xcode help is good.

comment:12 Changed 12 years ago by Carlos Morales

comment:13 Changed 12 years ago by Robert

Milestone: changed from Good idea for "later" to Adium bugs

Changed 11 years ago by Pavel Safronov

Attachment: patch.diff added

comment:14 Changed 11 years ago by Robert

Milestone: changed from Adium bugs to Adium 1.4
Patch Status: Needs Dev Review

comment:15 Changed 11 years ago by Stephen Holt

Owner: changed from nobody to Stephen Holt

comment:16 Changed 11 years ago by Robert

Patch Status: changed from Needs Dev Review to Accepted
Resolution: fixed
Status: changed from new to closed

comment:17 Changed 11 years ago by Stephen Holt <sholt@…>

(In 07e07f9788a6) Undo part of [3855f70905bd]: it caused strings to be incorrectly escaped, which broke our XML parsing. Instead, escape hash symbols when parsing in XML data so as not to confuse AIHTMLDecoder. Refs #8141. Fixes #12856.
