Adium

Ticket #8141 (closed defect: fixed)

Opened 2 years ago

Last modified 7 months ago

&# construction in URL mistakenly interpreted as HTML entity

Reported by: bakert
Owned by: earthmkii
Milestone: Adium 1.4
Component: Adium Core
Version:
Severity: normal
Keywords:
Cc:
Patch Status: Accepted

Description

If a query string in a URL pasted into Adium ends in & and is followed by an anchor that is an integer, the link will not work.

Steps to reproduce:

Actual:

  • &#49224 is interpreted as HTML entity 쁈 - one of the "Hangul Syllables" (you can see it by mousing over the link or clicking it) - this stops the link from working.

Expected:

This is a valid URL according to RFC 2396.

Adium Version: 1.1.3 (Sunday Sept 30th 2007) OS X Version: 10.4.10
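
The behaviour described above can be illustrated outside Adium. This is only a sketch: the URL is hypothetical (the ticket's reproduction steps were not preserved), and Python's `html.unescape` stands in for Adium's own decoder because it also accepts a numeric reference that lacks its closing semicolon.

```python
import html

# Hypothetical URL: a query string ending in "&" followed by an integer anchor.
url = "http://example.com/view?id=7&#49224"

# A lenient decoder treats "&#49224" as a numeric character reference
# even though no ";" terminates it, and replaces it with U+C048.
decoded = html.unescape(url)

print(repr(decoded))
```

After decoding, the `&#49224` tail is gone and a single Hangul syllable (U+C048) sits in its place, so the link no longer points at the original address.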

Attachments

patch.diff (1.5 KB) - added by terminus 7 months ago.

Change History

Changed 2 years ago by jas8522

  • milestone set to Adium X 1.2.1

Hahaha, yep it changes the end of the link into a Hangul Syllable like you said. Tested with 1.2svn via Yahoo though I'm sure it's the same for all.

Changed 2 years ago by phy1729

I can't replicate this on Adium 1.2, Mac OS X 10.4.11. Tried sending to myself by AIM and through GTalk (web sending; Adium receiving).

Changed 2 years ago by phy1729

Oddly it does show as a Hangul Syllable in the chat viewer.

Changed 2 years ago by jas8522

I can still reproduce when sending via Adium - I believe it's because it's not escaped before being rendered by WebKit in the message view.

Changed 2 years ago by evands

  • milestone changed from Adium X 1.2.1 to Good idea for "later"

This is pretty nontrivial to fix, I think. I welcome a patch to fix it from someone who understands HTML and encoding/decoding!

Changed 2 years ago by phy1729

It's saved correctly in the log, with the amp escaped, so it's a problem in the rendering. The parser should check for the ending semicolon before it converts the entity. Also, it would seem that the parser is run twice, because the amp seems to be included in the entity when displayed. If you tell me where to look, I'd be happy to look at it.
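
A minimal sketch of the two points made in this comment: convert an entity only when a terminating semicolon is present, and scan the text once so that an unescaped `&` is never re-read as the start of another entity. This is not Adium's code; `decode_terminated` and `_lookup` are hypothetical names used only for illustration.

```python
import string
from html.entities import name2codepoint


def _lookup(body):
    """Return the character for an entity body (text between & and ;), or None."""
    if body.startswith("#"):
        digits, base = body[1:], 10
        if digits[:1] in ("x", "X"):
            digits, base = digits[1:], 16
        allowed = string.hexdigits if base == 16 else string.digits
        if not digits or any(c not in allowed for c in digits):
            return None
        code = int(digits, base)
        return chr(code) if code <= 0x10FFFF else None
    code = name2codepoint.get(body)
    return chr(code) if code is not None else None


def decode_terminated(text):
    """Decode entities in one pass, only when they end with a semicolon."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "&":
            end = text.find(";", i + 1)
            if end != -1:
                ch = _lookup(text[i + 1:end])
                if ch is not None:
                    out.append(ch)
                    i = end + 1  # skip past the entity; its output is not rescanned
                    continue
        out.append(text[i])
        i += 1
    return "".join(out)
```

With this approach `a=1&#49224` is left untouched, `&#49224;` is decoded, and `&amp;#49224` becomes the literal text `&#49224` rather than a Hangul syllable.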

Changed 2 years ago by evands

Great! I think that the most likely culprit is -[AIHTMLDecoder decodeHTML:withDefaultAttributes:], perhaps at line 1352.

Changed 2 years ago by phy1729

Which language is the code written in? I think line 1339 in AIHTMLDecoder.m captures &nbsp; along with &nbsp, which is a problem: all entities are supposed to be enclosed by an & and a ;. Also, amp should be unescaped last because it starts an entity; otherwise &amp;nbsp; will be recognized as &nbsp; and then later made a non-breaking space. If you tell me the language (it looks like C), I can try to learn it and be more helpful.
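
The ordering concern can be shown with plain string replacement. This is a sketch assuming the decoder works as a sequence of per-entity substitutions; both function names are hypothetical.

```python
NBSP = "\u00a0"  # non-breaking space


def decode_amp_first(text):
    # Unescaping &amp; first produces a new "&" that the next step re-reads.
    return text.replace("&amp;", "&").replace("&nbsp;", NBSP)


def decode_amp_last(text):
    # Unescaping &amp; last means its output is never scanned again.
    return text.replace("&nbsp;", NBSP).replace("&amp;", "&")


escaped = "&amp;nbsp;"  # the literal text "&nbsp;", correctly escaped
print(repr(decode_amp_first(escaped)))
print(repr(decode_amp_last(escaped)))
```

Only the second ordering preserves the literal text `&nbsp;`; the first silently turns it into a non-breaking space.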

Changed 2 years ago by evands

That explanation sounds pretty reasonable to me. The language of Adium (and of most Mac OS X programs) is Objective-C, also called Cocoa (though really there is a subtly different meaning to 'Cocoa'). Objective-C is a superset of C.

 ObjCFun is a pretty brief introduction to the syntax; it may be enough to get you started if you're familiar with C.  TooDarkPark has a more in-depth article. Both those link's are from Chris's page about  getting started with Cocoa.

Changed 2 years ago by evands

links, not link's.

Changed 2 years ago by phy1729

The best thing I can think of would be to change line 1339 to a regex conditional. I think [&[0x[0-9a-f]+|#[0-9]+|[A-Za-z0-9]+];]+? will work, but regex can be slow, so there is most likely a better solution than mine, possibly coding our own.

Thanks for the links, evands. Also, Mac's Xcode help is good.
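
In most regex dialects square brackets denote character classes rather than groups, so the pattern proposed in this comment would need parentheses to express its alternatives. The following is one possible reading of the intent, not the patch that was eventually applied; it assumes the "0x" branch was meant to cover hexadecimal references of the form `&#x...;`, and `ENTITY_RE` and `find_entities` are hypothetical names.

```python
import re

# "&", then a hex reference, a decimal reference, or a name, then a required ";".
ENTITY_RE = re.compile(r"&(?:#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z0-9]+);")


def find_entities(text):
    """Return every semicolon-terminated entity found in text."""
    return [m.group(0) for m in ENTITY_RE.finditer(text)]


print(find_entities("a&amp;b&#49224;c&#x41;d"))
print(find_entities("http://example.com/?a=1&#49224"))
```

Because the semicolon is mandatory, the unterminated `&#49224` at the end of a URL is not reported as an entity.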

Changed 22 months ago by djmori

Changed 17 months ago by Robby

  • milestone changed from Good idea for "later" to Adium bugs

Changed 7 months ago by terminus

Changed 7 months ago by Robby

  • patch_status set to Needs Dev Review
  • milestone changed from Adium bugs to Adium 1.4

Changed 7 months ago by earthmkii

  • owner changed from nobody to earthmkii

Changed 7 months ago by Robby

  • status changed from new to closed
  • patch_status changed from Needs Dev Review to Accepted
  • resolution set to fixed

Changed 7 months ago by Stephen Holt <sholt@…>

(In 07e07f9788a6) Undo part of [3855f70905bd]: it caused strings to be incorrectly escaped, which broke our XML parsing. Instead, escape hash symbols when parsing in XML data so as not to confuse AIHTMLDecoder. Refs #8141. Fixes #12856.
