Webapps chatlog - 2007-05-22


14:44:27   --> Blogger (webapps-lo@polarboing.nvg.ntnu.no) has joined #webapps
14:44:27 irc.opera.com topic is: Web applications | Chatlogs of the <video> element chat at http://my.opera.com/WebApplications/blog/show.dml/943945#comment2818528 | No private messages and keep on topic
14:44:27 irc.opera.com Users on #webapps: Blogger Do`` aleksanteri Orcinus Ramunas DiscoStud Hal LarsKL dantesoft derenth Remco gmoz Orca OmegaPhil Toman- ~nicomen
14:45:37 Remco Cool
14:50:09   --> Blogger (webapps-lo@polarboing.nvg.ntnu.no) has joined #webapps
14:50:09 irc.opera.com topic is: Web applications | Chatlogs of the <video> element chat at http://my.opera.com/WebApplications/blog/show.dml/943945#comment2818528 | No private messages and keep on topic
14:50:09 irc.opera.com Users on #webapps: Blogger Do`` aleksanteri Orcinus Ramunas DiscoStud Hal LarsKL dantesoft derenth Remco gmoz Orca OmegaPhil Toman- ~nicomen
14:50:29 Ramunas * Blogger is blogging
14:50:31 Ramunas :P
14:51:45 aleksanteri hehe
14:51:54 nicomen weblog -> blog, so why not webapps log -> blog ;)
14:52:06   <-- nicomen has left #webapps
14:52:06   --> nicomen (nicomen@polarboing.nvg.ntnu.no) has joined #webapps
14:52:16 nicomen !pointer
14:52:24 nicomen Blogger: pointer
14:52:24 nicomen See http://webapps.opentheweb.org/2007-05-22#T14-52-24
14:52:42 aleksanteri aye cool :D
14:53:23 Ramunas when highlighting words, he ignores nicks, imho it should highlight them too
14:53:31 gmoz wtf bbq grass
14:53:40 Ramunas I mean if the word is also a nick, it doesn't get highlighted
14:53:45 gmoz Blogger: bbqgrass
14:53:45 gmoz I'm logging. I don't understand 'bbqgrass', gmoz. Try /msg Blogger help
14:54:05 aleksanteri haha
14:54:11 aleksanteri Blogger: how are you?
14:54:11 aleksanteri I'm logging. Sorry, searching removed.
14:54:29 gmoz Blogger: IT's GOOFY TIME!
14:54:29 gmoz I'm logging. I don't understand 'IT's GOOFY TIME!', gmoz. Try /msg Blogger help
14:54:44 gmoz "I'm loggin" is just so... stupid reply to everything :P
14:54:48 aleksanteri nicomen: you have a bug there... take a look at the log?
14:54:49 Ramunas Blogger: blog about gmoz
14:54:49 Ramunas I'm logging. I don't understand 'blog about gmoz', Ramunas. Try /msg Blogger help
14:54:57 aleksanteri Blogger: what's up?
14:54:57 aleksanteri I'm logging. Sorry, searching removed.
14:54:58 gmoz I'm loggin mah lazer
14:54:59 aleksanteri ........
14:55:06 nicomen aleksanteri: bug where?
14:55:16 nicomen ah he uses your nick?
14:55:19 aleksanteri yeah
14:55:25 nicomen yeah I will look into that
15:10:24   --> DrLaunch (knutremi@ti112210a080-6340.bb.online.no) has joined #webapps
15:10:29   <-- DrLaunch has quit (Connection reset by peer)
15:14:54 gmoz bleh
15:15:02 gmoz Since regex seems to hate wiki infoboxes
15:15:13 gmoz let's just remove 'em with traditional substr based approach :P
15:15:36 nicomen what's wiki infoboxes?
15:15:37   --> Do``2 (~Do3@dsl77-234-79-227.pool.tvnet.hu) has joined #webapps
15:16:24   --> pffYussupov (pffYussupo@ADSL-TPLUS-98-93.telecomplus.net) has joined #webapps
15:16:26 gmoz those boxes that are displayed on the right side of the article often containing random info
15:16:37 gmoz in the case of country or city, population and stuff like that
15:16:44 gmoz it's just a floated table
15:17:24   <-- Do`` has quit (Ping timeout)
15:17:24   -!- Do``2 is now known as Do``
15:18:04   --> DrLaunch (knutremi@ti112210a080-6340.bb.online.no) has joined #webapps
15:18:54 gmoz Grarh
15:19:02 gmoz They still remain ;o
15:19:11 gmoz I'm failing something now I'm quite sure...
15:19:23 gmoz argh
15:19:25 gmoz I'm an idiot
15:19:30 gmoz I just realized the problem :P
15:20:09 aleksanteri well?
15:20:19 gmoz Of course it won't remove the infobox when I'm doing the ops on the contents of the box :P
15:20:31 gmoz it's grabbing those when it should grab something that comes after it
15:21:18 gmoz well the regex isn't working even on the correct variable.. =)
15:22:19 gmoz and I think my other approach got stuck in an infinite loop :D
15:25:20 gmoz Yep :P
15:25:23 gmoz Go me
15:26:25 gmoz Oh.
15:26:33   -!- * aleksanteri is still hammering his mysqld...
15:26:34 gmoz What a stupid mistake
15:26:45 gmoz I was offsetting indexOf with the length of the <table thing
15:26:55 gmoz so it wasn't actually removing it at all, it was probably just removing the contents >_>
15:27:29 gmoz Yes
15:27:33 gmoz Now it works, yay
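Note: gmoz's widget code is not in the log, so the following is only a sketch of the substring approach he describes, with the offset mistake avoided. The function name and the sample markup are made up for illustration.

```javascript
// Hypothetical helper: removes the first <table>...</table> span from a string.
// It assumes the first table contains no nested tables.
function removeFirstTable(html) {
  const OPEN = "<table";
  const CLOSE = "</table>";
  const start = html.indexOf(OPEN);
  if (start === -1) return html;          // no table at all
  const end = html.indexOf(CLOSE, start);
  if (end === -1) return html;            // unclosed table: leave input alone
  // Cut from `start` itself. Using `start + OPEN.length` here would keep the
  // "<table" text and only drop what follows it. The length offset belongs on
  // the closing side, so that "</table>" is removed as well.
  return html.slice(0, start) + html.slice(end + CLOSE.length);
}

const page = 'intro<table class="infobox"><tr><td>x</td></tr></table>body';
const cleaned = removeFirstTable(page); // "introbody"
```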
15:27:52 gmoz I changed the way my widget looks up stuff from wikipedia for better results
15:32:36 gmoz it's pretty easy to get better data from wiki by some simple removes
15:32:51 gmoz just remove all stuff in [ ] because they're mostly just some randomness
15:33:06 gmoz and remove everything in ( ) because they're just remarks, spelling, foreign names and other things
15:33:54 gmoz especially in the articles about asian and russian cities there are often the names in cyrillics, spelling guides and other things like that
15:34:03 gmoz remove them and you get much more useful data
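Note: a minimal sketch of the cleanup described above, assuming the summary is already plain text. The helper name and the sample sentence are illustrative, and nested parentheses are not handled. Negated character classes keep each match from running past the first closing bracket.

```javascript
// Hypothetical helper: strips [...] and (...) spans from a plain-text summary.
function stripRemarks(text) {
  return text
    .replace(/\[[^\]]*\]/g, "")     // drop [1], [citation needed], ...
    .replace(/\([^)]*\)/g, "")      // drop (remarks), non-nested only
    .replace(/ {2,}/g, " ")         // collapse the double spaces left behind
    .replace(/ +([,.;:])/g, "$1")   // no space before punctuation
    .trim();
}

const sample = "Moscow (Russian: Москва) is the capital[1] of Russia.";
const tidy = stripRemarks(sample); // "Moscow is the capital of Russia."
```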
15:35:43   --> Do``2 (~Do3@dsl77-234-82-24.pool.tvnet.hu) has joined #webapps
15:35:57   <-- DrLaunch has quit (Quit: Leaving.)
15:36:08   --> DrLaunch (knutremi@ti112210a080-6340.bb.online.no) has joined #webapps
15:37:02   --> letham (king_heart@adsl196-167-205-217-196.adsl196-15.iam.net.ma) has joined #webapps
15:37:31   <-- Do`` has quit (Ping timeout)
15:37:31   -!- Do``2 is now known as Do``
15:40:49   <-- DrLaunch has quit (Ping timeout)
15:42:47   --> DrLaunch (knutremi@ti112210a080-6340.bb.online.no) has joined #webapps
15:52:33   <-- letham has quit (Ping timeout)
16:00:38 gmoz hmm
16:00:50 gmoz /\(.+\)/g doesn't appear to like the article on Barcelona
16:01:03 gmoz it just eats the whole summary chapter :P
16:01:10 gmoz except for the first word
16:08:22 gmoz omg
16:08:36 gmoz some really smart person has commented Travel
16:08:41 gmoz this is great but wana know what i segest?
16:08:41 gmoz how about a widget that serches for hotel rooms and prices and star rateings?
16:08:43 gmoz like uhhh that websight
16:08:58 gmoz Gee... I really don't get what the HOTEL SEARCH does...
16:09:19 nicomen who are you talking to?
16:09:26 gmoz Just random ranting
16:09:34 gmoz I'm talking to the one who reads it
16:09:35 gmoz =)
16:10:22 gmoz My random coding-related talks I can explain with this: I sometimes find results to problems better after explaining it like I do
16:13:21   --> roentgen (Miranda@9FD7C41A.9FFD1279.3E189396.IP) has joined #webapps
16:16:25   --> letham (king_heart@adsl196-167-205-217-196.adsl196-15.iam.net.ma) has joined #webapps
16:16:33 gmoz hmm
16:16:51 gmoz I wonder what confused the regex so much in the Barcelona article
16:17:13 gmoz I replaced it with /\([^)]*\)/ which is basically the same..
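Note: the two patterns are not quite the same. One plausible explanation for the Barcelona result (the log does not confirm it) is that the greedy `.+` runs from the first `(` to the last `)` on the line, while `[^)]*` stops at the nearest `)`. The sample sentence below is made up.

```javascript
const summary =
  "Barcelona (Catalan: Barcelona) is a city in Spain (Europe) on the coast.";

// Greedy: one match, from the first "(" to the last ")".
const greedy = summary.replace(/\(.+\)/g, "");

// Bounded: each match ends at the first ")" after its "(".
const bounded = summary.replace(/\([^)]*\)/g, "");

// greedy  === "Barcelona  on the coast."
// bounded === "Barcelona  is a city in Spain  on the coast."
```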
16:18:00 nicomen did it span several lines?
16:18:17 gmoz Not sure
16:18:20 gmoz possibly
16:18:25 nicomen not sure?
16:19:02 gmoz I don't think it did but it's hard to say from the source-viewer in Opera :P
16:19:28 nicomen javascript:a="aaa\nbbb"; alert(a.match(/(.*)/)[1]);
16:20:03 nicomen javascript:a="aaa\nbbb"; alert(a.match(/(.*)/)[1]); alert(a.match(/([^)]*)/)[1]);
16:20:40 nicomen aaa
16:20:42 nicomen aaa
16:20:43 nicomen bbb
16:22:44 gmoz hm
16:22:48 gmoz interesting side-effect :P
16:23:43 gmoz but
16:23:52 gmoz javascript:a="aaa\nbbb"; alert(a.replace(/.*/g,''));
16:23:55 gmoz and you get nothingness
16:24:04 gmoz therefore the newline doesn't appear to apply in that case
16:24:04 nicomen yes
16:24:22 nicomen but try to run that without /g ;)
16:24:28 gmoz yeah was just about to mention that :P
16:24:46 nicomen and it's probably not nothingness either
16:24:49 nicomen but a single \n
16:24:55 gmoz :P
16:25:15 gmoz javascript:a="(aaa\nbbb)"; alert(a.replace(/\(.*\)/,''));
16:25:17 nicomen javascript:a="aaa\nbbb"; alert("<"+a.replace(/.*/g,'')+">");
16:25:18 gmoz doesn't replace anything at all
16:25:35 nicomen gmoz: exactly
16:25:45 gmoz that might actually be the problem with the regex that I tried to use with the table replace stuff
16:25:45 nicomen not even with /g
16:25:58 nicomen since "\n" is not a "."
16:26:04 nicomen so, like I said earlier
16:26:05 gmoz but I wonder why did it replace EVERYTHING in the barcelona-case
16:26:06 nicomen always use:
16:26:12 nicomen [\s\S] when you mean "."
16:26:17 nicomen in javascript
16:26:18 gmoz ah
16:26:21 nicomen regexps are broken ;/
16:26:24 gmoz I was confused because you were talking to ramunas :P
16:26:28 nicomen ok
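Note: the behaviour discussed above can be checked outside a bookmarklet too. A small sketch using the same test strings as the chat:

```javascript
const a = "(aaa\nbbb)";

// "." does not match "\n", so the pattern never reaches the ")" and nothing is replaced.
const withDot = a.replace(/\(.*\)/, "");

// [\s\S] matches any character including "\n", so the whole string is removed.
const withClass = a.replace(/\([\s\S]*\)/, "");

// With /g, both lines are emptied but the newline between them survives.
const leftover = "aaa\nbbb".replace(/.*/g, "");

// withDot   === "(aaa\nbbb)"
// withClass === ""
// leftover  === "\n"
```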
16:27:09 Orcinus They need a "s" regexp flag in javascript, like they do in PHP and Perl
16:27:20 nicomen true that
16:27:21 gmoz yeah
16:27:28 aleksanteri the s flag=?
16:27:35 aleksanteri i mean what it does
16:27:47 Orcinus makes . also match \n
16:28:11 nicomen not really
16:28:13 aleksanteri ah
16:28:28 nicomen . matches \n in regular lines too in other languages
16:29:22 nicomen ah no you are right
16:29:37 nicomen it's the /m that does what I was going to say
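Note: later editions of the language (ECMAScript 2018) did add an `s` flag to JavaScript, so the sketch below only runs on engines that support it. It shows the difference between the two flags: `s` changes what `.` matches, `m` only changes where `^` and `$` can match.

```javascript
const text = "aaa\nbbb";

const plainDot = /a.b/.test(text);        // false: "." stops at "\n"
const dotAll = /a.b/s.test(text);         // true: with s, "." also matches "\n"
const multilineDot = /a.b/m.test(text);   // false: m does not affect "."

const plainAnchors = /^bbb$/.test(text);      // false: ^ and $ only match at string ends
const multilineAnchors = /^bbb$/m.test(text); // true: with m, they also match at line breaks
```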
16:30:44 gmoz and then the next problem
16:30:48   --> ROBOd (robod@86.34.246.154) has joined #webapps
16:30:49 gmoz tables inside tables :P
16:31:04 Orcinus wheels within wheels?
16:31:05 gmoz it's breaking my table-removal code in Baltimore article (at least)
16:31:37 nicomen you would need to do a recursive parser thing
16:31:46 nicomen remembering at which level you are in
16:31:49 nicomen sucksors
16:32:39 gmoz Yeah :P
16:32:44 Orcinus Just keep removing tables until there are none left
16:32:54 gmoz doesn't work
16:34:01 Orcinus do { var old = a; a = a.replace(/<table[\s\S]+?<\/table>/, ""); } while (old != a);
16:35:04 gmoz not sure if that'll work but I'll try
16:35:15 Orcinus add a /g modifier too
16:36:07 nicomen I should make such a thing and get over it once and for all
16:36:26 nicomen struggled with similar problem having nested [quote] in bbcode
16:36:46 nicomen luckily, I just convert to html, so I don't need to keep track of levels
16:37:12 Orcinus It's a simple thing with the ungreedy operator
16:37:26 gmoz Orcinus: doesn't wooork :P
16:37:31 Orcinus just keep matching the smallest sets of open and closing tags
16:37:36 nicomen [quote 1] [quote 2] [/quote 2] [/quote 1] => <div> <div> </div> </div>
16:38:01 gmoz do
16:38:02 gmoz  {
16:38:02 gmoz  var old = txt;
16:38:02 gmoz  txt = txt.replace(/<table[\s\S]+?<\/table>/g, "");
16:38:02 gmoz  }
16:38:04 gmoz  while (old != txt);
16:38:14 gmoz that's what I have and it doesn't work
16:38:34 nicomen Orcinus: <table><table></table></table> => removes the part inside (): (<table><table></table></table>)</table>
16:38:47 nicomen err
16:38:52 nicomen Orcinus: <table><table></table></table> => removes the part inside (): (<table><table></table>)</table>
16:39:10   <-- pffYussupov has quit (Quit: Every time I make a website using frames god kills a Pokemon or two)
16:40:08 nicomen and if was greedy:
16:40:42 nicomen <table 1><table 2></table 2></table 1><table 3></table 3> => you would remove all three tables
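Note: both failure modes nicomen describes are easy to reproduce. The strings below are simplified stand-ins for real markup.

```javascript
// Non-greedy: the match ends at the FIRST </table>, which belongs to the inner table,
// so the outer closing tag is left behind. Looping does not help, because the
// leftover "</table>" no longer has an opening tag to pair with.
const nested = "<table><table></table></table>";
const lazy = nested.replace(/<table[\s\S]+?<\/table>/g, "");

// Greedy: the match runs to the LAST </table>, so everything between the
// tables (here "B") is removed along with them.
const several = "A<table><table></table></table>B<table></table>C";
const greedy = several.replace(/<table[\s\S]+<\/table>/, "");

// lazy   === "</table>"
// greedy === "AC"
```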
16:41:12 nicomen so in short, one would need to:
16:41:27 nicomen 1. find the <table> you want to get rid of
16:41:38 nicomen 2. operate on the string from that table and outwards
16:41:56 gmoz hmm
16:41:58 nicomen 3. add a counter on every <table> you find
16:42:06 gmoz I wonder if it would be easy to do a backwards table reverse
16:42:13 nicomen 4. skip that amount of </table>
16:42:24 Orcinus I forgot I was using lookahead assertions in my PHP code which does that
16:42:29 nicomen 5. find out where the next </table> is
16:42:46 nicomen 6. remove the part from the first <table> to the </table> found 5.
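Note: nicomen's six steps can be sketched roughly as below. This is one possible reading, not code from the chat. It keeps a depth counter instead of literally skipping a number of closing tags, which amounts to the same thing. The function name is hypothetical, and `start` is assumed to point at the `<table` you want to remove.

```javascript
// Hypothetical helper: removes the table that opens at index `start`,
// including any tables nested inside it.
function removeTableAt(html, start) {
  const tag = /<table\b|<\/table>/gi;   // matches opening and closing table tags
  tag.lastIndex = start;                // begin scanning at the chosen table
  let depth = 0;
  let m;
  while ((m = tag.exec(html)) !== null) {
    if (m[0].charAt(1) === "/") {
      depth -= 1;                       // a closing tag ends one level
      if (depth === 0) {
        // This closing tag pairs with the table at `start`: cut the whole span.
        return html.slice(0, start) + html.slice(tag.lastIndex);
      }
    } else {
      depth += 1;                       // an opening tag starts a new level
    }
  }
  return html;                          // unbalanced markup: leave input untouched
}

const doc =
  'a<table class="infobox"><tr><td><table></table></td></tr></table>b<table></table>c';
const withoutInfobox = removeTableAt(doc, doc.indexOf("<table"));
// withoutInfobox === "ab<table></table>c"
```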
16:42:47 gmoz use lastIndexOf to find the last table and then just use indexOf to find the next </table> after it
16:42:50 gmoz repeat.
16:43:19 gmoz if it's with lastIndexOf, you can't accidentally remove the parent table first I think
16:44:01 Orcinus gmoz: you could also use DOMParser and remove all table objects that way
16:44:10 Orcinus Wikipedia is XHTML Trans
16:44:16 Orcinus so valid XML
16:44:31 gmoz that actually might work
16:45:08 nicomen ah that's true
16:45:23 nicomen mark the table you want to get rid of by adding an id
16:45:33 nicomen put the document in a DOM tree (or use innerHTML)
16:46:04 nicomen then document.getElementById("your id").parentNode.removeChild(document.getElementById("your id"));
16:46:12 gmoz well, anything with class="infobox"
16:46:19 gmoz but it might have more than one class
16:47:23 Orcinus if (className.indexOf("infobox") > -1) BALEETED
16:47:40 gmoz yeah
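Note: a sketch of the DOM idea, under a few assumptions. `indexOf("infobox")` would also match a class such as `noinfobox`, so this version compares whole class tokens instead. `removeInfoboxes` expects an already parsed DOM document (for example from DOMParser in a browser) and is not exercised here. Both function names are made up.

```javascript
// True when the space-separated class list contains `wanted` as a whole token.
function hasClass(className, wanted) {
  return className.split(/\s+/).indexOf(wanted) !== -1;
}

// Removes every <table> whose class list contains "infobox".
// `doc` is assumed to be a DOM Document supplied by the caller.
function removeInfoboxes(doc) {
  // Copy the live collection first so removals do not shift the iteration.
  const tables = Array.from(doc.getElementsByTagName("table"));
  for (const table of tables) {
    const classes = table.getAttribute("class") || "";
    if (hasClass(classes, "infobox") && table.parentNode) {
      table.parentNode.removeChild(table);
    }
  }
}

const multi = hasClass("infobox geography vcard", "infobox"); // true
const lookalike = hasClass("noinfobox", "infobox");           // false
```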
16:53:04 gmoz wtf
16:53:11 gmoz Aptana decided to close all out of the blue
17:00:30 gmoz testing the reverse removal idea for fun =)
17:00:34 gmoz let's see if it works
17:01:43   <-- Ramunas has quit (Ping timeout)
17:02:51 gmoz ha!
17:02:52 gmoz it worked
17:03:04 gmoz and no depth checks either =)
17:03:27 Orcinus yay
17:04:46   --> Ramunas (Ramunas@88.119.35.243) has joined #webapps
17:04:51 gmoz http://iikeli.ath.cx:3000/revrem.js
17:04:55 gmoz it's so simple it's beautiful!
17:04:56 gmoz lol
17:05:31 gmoz you could probably get rid of the extra postErrors and the infinite loop check if you want
17:05:34 gmoz =)
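Note: the contents of revrem.js are not in the log, so this is only a guess at the reverse-removal idea gmoz described: find the last `<table`, remove up to the next `</table>`, repeat. The last opening tag in the string cannot have another table inside it, so no depth tracking is needed. The function name is hypothetical, the match is case-sensitive, and this version removes every table, not only infoboxes.

```javascript
// Hypothetical helper: removes all tables, innermost and last ones first.
function removeTablesInnermostFirst(html) {
  const OPEN = "<table";
  const CLOSE = "</table>";
  let out = html;
  for (;;) {
    const start = out.lastIndexOf(OPEN);
    if (start === -1) break;              // no tables left
    const end = out.indexOf(CLOSE, start);
    if (end === -1) break;                // unclosed table: stop instead of looping forever
    // Each pass removes at least one tag pair, so the string shrinks and the loop ends.
    out = out.slice(0, start) + out.slice(end + CLOSE.length);
  }
  return out;
}

const article = "A<table><tr><td><table></table></td></tr></table>B<table></table>C";
const stripped = removeTablesInnermostFirst(article); // "ABC"
```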
17:07:49   <-- Ramunas has quit (Ping timeout)
17:09:59   --> Ramunas (Ramunas@78.56.79.101) has joined #webapps
17:13:29   --> MrMind (adie@61.5.16.113) has joined #webapps
17:18:04   <-- MrMind has quit (Ping timeout)
17:41:26   --> nicomen_ (nicomen@b086c.studby.ntnu.no) has joined #webapps
17:42:24   <-- nicomen_ has quit (Client exited)
17:42:27   --> redwine (redwine01@bbgw10.bdcom.net) has joined #webapps
17:43:01   --> nicomen_ (nicomen@b086c.studby.ntnu.no) has joined #webapps
17:43:09   <-- nicomen_ has quit (Quit: nicomen_)
17:44:08   --> nicomen_ (nicomen@b086c.studby.ntnu.no) has joined #webapps
17:44:39   <-- nicomen_ has quit (Client exited)
17:44:42   -!- redwine is now known as redwine001
17:44:57   <-- redwine001 has quit (Quit: redwine001)
17:44:58   --> nicomen_ (nicomen@b086c.studby.ntnu.no) has joined #webapps
17:45:17   <-- nicomen_ has quit (Quit: nicomen_)
17:46:19   --> nicomen_ (nicomen@b086c.studby.ntnu.no) has joined #webapps
17:46:28   <-- nicomen_ has quit (Quit: nicomen_)
17:46:41   --> nicomen_ (nicomen@b086c.studby.ntnu.no) has joined #webapps
17:47:27 aleksanteri nicomen_: Blogger doesn't log mode changes :P
17:47:30   <-- nicomen_ has quit (Quit: nicomen_)
17:47:42 nicomen should it?
17:47:57 aleksanteri well i think they belong to it
17:50:15 nicomen well if you say so ;)
17:51:24   <-- letham has quit (Client exited)
17:55:31   --> playgal (ashianee@41.207.129.28) has joined #webapps
17:55:58 Remco nicomen: Here to stay now?
17:56:37   <-- playgal has quit (Quit: playgal)
17:56:41 nicomen Remco: yes
17:56:53 Remco Ok
17:57:50 Remco You can come back in #opera and #lounge now
18:00:11 Remco It was a bit bouncy for a few people ;)
18:09:57 gmoz I'm tempted to write a JavaScript parser for wikipedia documents
18:21:14   --> prince (mp3_sammi@61.2.215.136) has joined #webapps
18:24:06   <-- prince has left #webapps
18:48:02   --> playgal (ashianee@41.207.129.28) has joined #webapps
18:48:18   <-- playgal has left #webapps
19:05:53   --> Ramunas_ (Ramunas@78.56.78.191) has joined #webapps
19:06:11   <-- Ramunas has quit (Killed (NickServ (GHOST command used by Ramunas_)))
19:06:16   -!- Ramunas_ is now known as Ramunas
19:11:22   <-- Ramunas has quit (Ping timeout)
19:11:53   --> Ramunas (Ramunas@88.119.36.222) has joined #webapps
19:22:26   --> letham (king_heart@adsl196-167-205-217-196.adsl196-15.iam.net.ma) has joined #webapps
19:23:22   --> Lars_G (lars@ippool.ifxnetworks.com.ve) has joined #webapps
19:32:41   <-- letham has left #webapps
19:44:53   --> kh (opera@80.191.250.245) has joined #webapps
19:53:23 nicomen Remco: no problem, I don't need to hang at #opera and #lounge, it was just my other client
20:01:26   <-- aleksanteri has quit (Client exited)
20:02:10   --> aleksanteri (K2@masked-8A8EDC9.dhcp.inet.fi) has joined #webapps
20:11:32   <-- Lars_G has quit (Quit: Leaving)
20:32:03   <-- aleksanteri has quit (Quit: aleksanteri)
20:44:12   <-- kh has left #webapps
20:46:49   --> Do``2 (~Do3@dsl-77-234-76-26.pool.tvnet.hu) has joined #webapps
20:48:07   <-- Do`` has quit (Ping timeout)
20:48:07   -!- Do``2 is now known as Do``
20:56:14   <-- ROBOd has quit (Quit: http://www.robodesign.ro )
21:12:58   --> mecoun (mireksmejk@www.bonsaistudio.cz) has joined #webapps
21:13:50 mecoun http://s4.bitefight.cz/c.php?uid=26088
21:13:52   <-- mecoun has left #webapps
21:47:42   <-- LarsKL has quit (Quit: )
22:05:50   <-- Orcinus has quit (Connection reset by peer)
22:08:01   <-- DrLaunch has quit (Quit: Leaving.)
22:16:44   --> Orcinus (orca@router.evertz.com) has joined #webapps
22:19:25   <-- roentgen has quit (Ping timeout)
22:19:33   --> Do``2 (~Do3@dsl77-234-79-88.pool.tvnet.hu) has joined #webapps
22:21:31   <-- Do`` has quit (Ping timeout)
22:21:31   -!- Do``2 is now known as Do``
22:26:57   <-- Orcinus has left #webapps
22:28:03   <-- Do`` has quit (Ping timeout)
22:40:08   --> Do`` (~Do3@dsl77-234-82-142.pool.tvnet.hu) has joined #webapps
23:25:17   <-- Do`` has quit (Ping timeout)
23:35:11   --> Do`` (~Do3@dsl85-238-88-94.pool.tvnet.hu) has joined #webapps
23:49:26 gmoz meh
23:49:45 gmoz I so wonder why doesn't xhr return proper xml docs when querying travelwiki
23:49:55 gmoz wikitravel, even
23:50:17 gmoz it should be just as much as xhtml as wikipedia