Opened 12 years ago

Closed 10 years ago

Last modified 10 years ago

#8141 closed defect (fixed)

&# construction in URL mistakenly interpreted as HTML entity

Reported by: bakert
Owned by: earthmkii
Milestone: Adium 1.4
Component: Adium Core
Version:
Severity: normal
Keywords:
Cc:
Patch Status: Accepted


If the query string of a URL pasted into Adium ends in & and is followed by an anchor that is an integer, the link will not work.

Steps to reproduce:


  • &#49224 is interpreted as HTML entity 쁈 - one of the "Hangul Syllables" (you can see it by mousing over the link or clicking it) - this stops the link from working.


This is a valid URL according to RFC 2396.

Adium Version: 1.1.3 (Sunday Sept 30th 2007)
OS X Version: 10.4.10

Attachments (1)

patch.diff (1.5 KB) - added by terminus 10 years ago.


Change History (18)

comment:1 Changed 12 years ago by jas8522

  • Milestone set to Adium X 1.2.1

Hahaha, yep it changes the end of the link into a Hangul Syllable like you said. Tested with 1.2svn via Yahoo though I'm sure it's the same for all.

comment:2 Changed 12 years ago by phy1729

I can't replicate this on Adium 1.2, Mac OS X 10.4.11. Tried sending to myself by AIM and through GTalk (web sending; Adium receiving).

comment:3 Changed 12 years ago by phy1729

Oddly it does show as a Hangul Syllable in the chat viewer.

comment:4 Changed 12 years ago by jas8522

I can still reproduce when sending via Adium. I believe it's because it's not escaped before being rendered by WebKit in the message view.

comment:5 Changed 12 years ago by evands

  • Milestone changed from Adium X 1.2.1 to Good idea for "later"

This is pretty nontrivial to fix, I think. I welcome a patch to fix it from someone who understands HTML and encoding/decoding!

comment:6 Changed 12 years ago by phy1729

It's saved correctly in the log, with the amp escaped, so it's a problem in the rendering. The parser should check for the ending semicolon before it converts the entity. Also, it would seem that the parser is run twice, because the amp seems to be included in the entity when displayed. If you tell me where to look, I'd be happy to look at it.
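
The double-decoding hypothesis in this comment can be illustrated with a minimal Python sketch. It is not Adium's code (Adium is Objective-C); Python's standard `html.unescape` is used only as a stand-in for a lenient decoder that accepts numeric references without a trailing semicolon, and the URL is a hypothetical example, since the ticket's original link is not shown.

```python
import html

# Hypothetical stand-in for the logged form of the link: the ampersand is escaped once.
logged = "http://example.com/page?a=1&amp;#49224"

once = html.unescape(logged)   # one decoding pass restores the literal URL
twice = html.unescape(once)    # a second pass reads "&#49224" as a character reference

print(once)
print(twice)
```

After one pass the link is intact; only the second pass replaces the end of the URL with the Hangul syllable, which matches the symptom described in the report.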

comment:7 Changed 12 years ago by evands

Great! I think that the most likely culprit is -[AIHTMLDecoder decodeHTML:withDefaultAttributes:], perhaps at line 1352.

comment:8 Changed 12 years ago by phy1729

Which language is the code written in? I think line 1339 of AIHTMLDecoder.m captures &nbsp; along with &nbsp, which is a problem: all entities are to be enclosed by an & and a ;. Also, amp should be unescaped last because it starts an entity; otherwise &amp;nbsp; will be recognized as &nbsp; and then later made a non-breaking space. If you tell me the language (it looks like C), I can try to learn it and be more helpful.
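
The point about unescaping amp last can be shown with a small Python sketch of a decoder that works by sequential string replacement. The entity table and both function names are hypothetical and chosen for illustration; nothing here is taken from AIHTMLDecoder.m.

```python
# Hypothetical three-entry table; a real decoder would know many more entities.
ENTITIES = [("&lt;", "<"), ("&gt;", ">"), ("&nbsp;", "\u00a0")]


def decode_amp_first(text):
    """Unescape &amp; before the other entities (the problematic order)."""
    text = text.replace("&amp;", "&")
    for entity, char in ENTITIES:
        text = text.replace(entity, char)
    return text


def decode_amp_last(text):
    """Unescape every other entity first, then &amp;."""
    for entity, char in ENTITIES:
        text = text.replace(entity, char)
    return text.replace("&amp;", "&")


escaped = "Type &amp;nbsp; for a space"
print(repr(decode_amp_first(escaped)))  # the literal "&nbsp;" is lost
print(repr(decode_amp_last(escaped)))   # the literal "&nbsp;" survives
```

With amp handled first, the freshly produced `&` combines with the following text into a new entity and is decoded a second time; handling it last avoids that.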

comment:9 Changed 12 years ago by evands

That explanation sounds pretty reasonable to me. The language of Adium (and of most Mac OS X programs) is Objective-C, also called Cocoa (though really there is a subtly different meaning to 'Cocoa'). Objective-C is a superset of C.

ObjCFun is a pretty brief introduction to the syntax; it may be enough to get you started if you're familiar with C. TooDarkPark has a more in-depth article. Both those link's are from Chris's page about getting started with Cocoa.

comment:10 Changed 12 years ago by evands

links, not link's.

comment:11 Changed 12 years ago by phy1729

The best thing I can think to do would be to change line 1339 to a regex conditional. I think [&[0x[0-9a-f]+|#[0-9]+|[A-Za-z0-9]+];]+? will work, but regex can be slow, so there is most likely a better solution than mine, possibly coding our own.

Thanks for the links, evands. Also, the Xcode help on the Mac is good.
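
As written, the proposed pattern uses square brackets (character classes) where grouping seems to be intended. A minimal Python sketch of what it appears to aim for follows. It assumes the hexadecimal branch is meant to match HTML's `&#x…;` form, and the entity table and function name are hypothetical; it is an illustration, not the patch that was attached to this ticket.

```python
import re

# Assumed subset of named entities, for illustration only.
NAMED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "nbsp": "\u00a0"}

# "&", then a hex reference, a decimal reference, or a name, then a required ";".
ENTITY = re.compile(r"&(#[xX][0-9A-Fa-f]+|#[0-9]+|[A-Za-z0-9]+);")


def decode_entities(text):
    """Decode entities in a single pass, only when they end in a semicolon."""
    def replace(match):
        body = match.group(1)
        try:
            if body[:2].lower() == "#x":
                return chr(int(body[2:], 16))
            if body.startswith("#"):
                return chr(int(body[1:]))
        except (ValueError, OverflowError):
            return match.group(0)  # out-of-range code point: leave untouched
        return NAMED.get(body, match.group(0))  # unknown names are left alone
    return ENTITY.sub(replace, text)


print(decode_entities("page?a=1&#49224"))  # no semicolon, so nothing is decoded
print(decode_entities("&amp;nbsp;"))       # single pass, so the result is not decoded again
```

Because the substitution runs once over the input, the `&` produced from `&amp;` is never rescanned, so the ordering problem and the missing-semicolon problem are both avoided in this sketch.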

comment:12 Changed 12 years ago by djmori

comment:13 Changed 11 years ago by Robby

  • Milestone changed from Good idea for "later" to Adium bugs

Changed 10 years ago by terminus

comment:14 Changed 10 years ago by Robby

  • Milestone changed from Adium bugs to Adium 1.4
  • Patch Status set to Needs Dev Review

comment:15 Changed 10 years ago by earthmkii

  • Owner changed from nobody to earthmkii

comment:16 Changed 10 years ago by Robby

  • Patch Status changed from Needs Dev Review to Accepted
  • Resolution set to fixed
  • Status changed from new to closed

comment:17 Changed 10 years ago by Stephen Holt <sholt@…>

(In 07e07f9788a6) Undo part of [3855f70905bd]: it caused strings to be incorrectly escaped, which broke our XML parsing. Instead, escape hash symbols when parsing in XML data so as not to confuse AIHTMLDecoder. Refs #8141. Fixes #12856.
