Vesuvius IRC meeting about GSoC, April 3, 2012
Transcript posted here because the official logging did not capture this session.
[#sahana-meeting channel | Note: This channel is logged by sahanamtglog | Logs available at http://logs.sahanafoundation.org/sahana-meeting]
[12:01] == glennp [820efe19@gateway/web/freenode/ip.130.14.254.25] has joined #sahana-meeting
[12:01] <Ramindu> hi, sophron!
[12:01] <sophron> hello everybody!
[12:01] <glennp> Hi, all
[12:02] <triune> <~ Greg Miernicki here :)
[12:02] <triune> g'day all!
[12:02] <sophron> triune: hello Greg, nice to meet you :)
[12:02] <triune> hello sophron
[12:03] <glennp> Shall we start with Ramindu's proposal?
[12:04] <sophron> Will Lan join us today or not?
[12:04] <glennp> ...about multiple language support, just submitted to Melange. I just read it over a moment ago.
[12:04] <glennp> Let me remind Lan...
[12:06] <glennp> ... done.
[12:06] <Ramindu> so, moste esteemed mentors, what sayeth thou?
[12:06] <glennp> Looks good generally. Just 2 areas of concern for me.
[12:06] <triune> I sayeth, I have not readeth it yet :)
[12:07] <glennp> One, which you & Greg already touched on I believe, is what happens now that Google Translate is no longer a free service.
[12:07] <Ramindu> triune, interesting... ;) glennp, I'm listening...
[12:08] <triune> doh, I don't think I'm added as a mentor in Melange yet
[12:08] <glennp> The funding model for Google Translate is pay per word, which is problematic for us.
[12:08] <glennp> Management here doesn't like unbounded costs.
[12:09] <Ramindu> hrm hem...
[12:09] <Ramindu> it's $20 per 1 million chars
[12:09] <Ramindu> char count on Vesuvius strings?
[12:09] <Ramindu> one-time caching?
[12:10] <triune> Glenn: it also looks like you cannot simply do monthly / quarterly billing with the API either ~ having a credit card and Google Wallet are a must
[12:10] <glennp> A model where you pay a fixed amount per month or year and get a max # of words would work better for us.
[12:10] <glennp> So we may have to look for alternatives to Google for this.
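[Editor's note: the $20-per-million-characters rate quoted above makes the cost question easy to sketch. A minimal back-of-the-envelope estimator follows; the string and language counts are made-up placeholders, not real Vesuvius figures.]

```python
# Rough cost estimate for translating Vesuvius UI strings with the paid
# Google Translate API, at the $20 per 1,000,000 characters rate quoted
# in the chat. All input numbers below are illustrative placeholders.

RATE_USD_PER_MILLION_CHARS = 20.0

def translation_cost(total_chars: int, num_languages: int) -> float:
    """Cost in USD to translate total_chars of source text into
    num_languages target languages, assuming each target language is
    billed on the full character count."""
    return total_chars * num_languages * RATE_USD_PER_MILLION_CHARS / 1_000_000

# Example: 200,000 characters of UI strings into 10 languages.
print(translation_cost(200_000, 10))  # 40.0
```

Even a generous over-estimate like this suggests a one-time cached translation pass is a bounded, modest cost, which is the point Ramindu raises next.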
[12:11] <triune> or we could add a CC and buy a set amount of quota and remove the card ~
[12:11] <Ramindu> so... a good solution (either a replacement or some sort of clever ploy where we know the amount we have to pay beforehand) for Google Translate needs to be part of the solution?
[12:11] <glennp> Definitely need caching on any paid service, and tracking of what changed and needs to be retranslated.
[12:12] <glennp> Ploys are good if you can find 'em.
[12:12] <Ramindu> I've heard they're quick on their feet, hard to catch.
[12:12] <glennp> Blend into the forest too
[12:13] <glennp> Other issue was:
[12:13] <glennp> Your model for Resource Pages was based on changing the default locale for the installation.
[12:15] <glennp> That's OK, but I'm envisioning a broader system where a particular page is pretranslated into multiple languages (differs per event), and the user selects.
[12:15] <Ramindu> yes, this would mean loading the pre-translated strings into the database at the point of installation
[12:15] <Ramindu> oh I see
[12:15] <glennp> So, for example, for the Haiti earthquake, the "How do I search" page is available in Eng, French, Creole, and the visitor chooses a language from a pulldown.
[12:15] <Ramindu> I guess keeping pre-translated texts in the database isn't a big problem, right?
[12:16] <glennp> Yeah, not a big problem. Just need to flag which language for each entry.
[12:16] <glennp> So that's a new column
[12:16] <Ramindu> so simply this would mean considering the session variable 'locale' and applying the relevant pre-translated string, if available
[12:17] <Ramindu> that sounds even more doable and within reach than what I suggested in my proposal ;)
[12:17] <glennp> There would still be a default locale for the installation.
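[Editor's note: the lookup scheme agreed above (pre-translated strings stored with a language column, session locale consulted first, installation default locale as fallback) can be sketched as follows. Table layout and function names are illustrative, not actual Vesuvius code.]

```python
# Sketch of the per-page translation lookup discussed in the chat:
# pre-translated strings carry a language flag, the session 'locale'
# is tried first, then the installation's default locale, then the
# untranslated string itself. The dict stands in for a database table.

DEFAULT_LOCALE = "en"

# Stand-in for table rows keyed by (message, locale).
translations = {
    ("How do I search?", "fr"): "Comment effectuer une recherche ?",
    ("How do I search?", "en"): "How do I search?",
}

def translate(message: str, session_locale: str) -> str:
    """Return the stored translation for the session locale, falling
    back to the default locale, then to the message unchanged."""
    for locale in (session_locale, DEFAULT_LOCALE):
        if (message, locale) in translations:
            return translations[(message, locale)]
    return message

print(translate("How do I search?", "fr"))  # Comment effectuer une recherche ?
print(translate("How do I search?", "ht"))  # How do I search?  (English fallback)
```

The fallback chain is what makes "there would still be a default locale" cheap to honor: a visitor who never touches the pulldown simply gets the default-locale rows.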
[12:18] <Ramindu> of course, that will be considered as well
[12:18] <Ramindu> triune, the session var will be set to the default locale until the user specifies otherwise through the dropdown, right?
[12:19] <glennp> He says right
[12:19] <triune> yes, that is correct...
[12:19] <Ramindu> triune is AFK?
[12:19] <triune> had to momentarily run over to the printer! :D
[12:20] <glennp> He flits around like a ploy :->
[12:20] <Ramindu> haha :D
[12:20] <glennp> That's all I've got as feedback.
[12:21] <glennp> If you want to research Google Translate alternatives, that would be good.
[12:21] <Ramindu> hmmm...
[12:21] <Ramindu> is it ok if I look for a way to use it while minimizing cost?
[12:22] <glennp> Sure
[12:22] <glennp> Maybe some Translate/Pootle hybrid
[12:23] <triune> perhaps also keep in mind a way to calculate the cost as well... i.e. how many translations will need to be made (per language, per person records, per _t(), etc.)
[12:23] <Ramindu> uhm... we already have a Translate/Pootle hybrid :D
[12:23] <triune> this way we can estimate how much of a translate API quota we would need to purchase
[12:24] <glennp> Ramindu: yeah, even as I typed that I thought -doh
[12:24] <Ramindu> I see...
[12:24] <Ramindu> there's a lot of potential here... when the code is looping through person records etc., we need to translate the labels (e.g. 'name') only once
[12:25] <Ramindu> unlike passing raw HTML to Translate, like what's being done now, it could be more intelligent
[12:25] <Ramindu> so even if we're using Translate only once and then caching it, cost would still be minimized
[12:26] <glennp> It would help to do trickle translation in advance for known strings, maybe prioritized by language.
[12:27] <glennp> So January we do Chinese, February we do Japanese, etc.
[12:27] <glennp> Spread out costs over time
[12:27] <Ramindu> that's a great idea!
[12:28] <triune> pre-caching the missing strings from the .po/.mo files is a good idea I had not considered
[12:29] <glennp> That's only 1 way to slice it. Another: January we do "how do I search" in 140 languages, February we do "how do I report" in 140 languages, etc.
[12:30] <glennp> Have to have some way to keep track of all this without going nuts, in the face of changes to the underlying English
[12:30] <Ramindu> ok now we have 3 nice ideas we can implement to minimize the financial pain of using Google Translate:
[12:31] <Ramindu> 1. We check *.po for untranslated strings, cache GTranslations of those
[12:31] <Ramindu> 2. We do monthly trickle-translations of one language per month, while building a methodology to track changes to English UI strings
[12:32] <Ramindu> 3. (this one is for the real efficiency nuts) Intelligent handling of strings, with recurring strings within the same page/other pages being sent to GTranslate only once
[12:33] <Ramindu> you do realize this is going to make my proposal uber-long? :D
[12:34] <glennp> Sure. Some of this latter stuff may just be to create a design, rather than an implementation.
[12:35] <glennp> sophron: Lan just phoned me, regrets she can't successfully connect to chat from her mobile.
[12:35] <sophron> oh
[12:35] <sophron> ok that's fine
[12:36] <sophron> i guess you could enlighten me with some issues on my proposal ;)
[12:36] <glennp> I suggest you create a Google doc, or submit a prelim Melange proposal, and Lan can give you feedback on that. As can we.
[12:37] <sophron> i will do that (maybe today)
[12:37] <glennp> But any enlightenment triune & I can give you now?
[12:37] <sophron> yes please
[12:37] <Ramindu> so triune and glennp, I guess we should wrap up for today, and give sophron a chance to discuss his proposal?
[12:38] *Ramindu* yup
[12:38] <Ramindu> I'll update my proposal and drop you a mail
[12:38] <glennp> Thanks, Ramindu.
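[Editor's note: ideas 1 and 3 above (scan the .po files for untranslated strings, and send each recurring string to the paid API only once) combine naturally. A minimal sketch, using simplified (msgid, msgstr) pairs as a stand-in for real gettext data:]

```python
# Scan gettext-style entries for untranslated strings, deduplicate
# repeats so each unique string would be billed only once, and report
# the character count to be sent to the paid translation API.
# The entry format is a simplified stand-in for parsed .po data.

def untranslated_strings(entries):
    """entries: iterable of (msgid, msgstr) pairs; an empty msgstr
    means untranslated. Returns unique untranslated msgids in order."""
    seen, result = set(), []
    for msgid, msgstr in entries:
        if msgstr == "" and msgid not in seen:
            seen.add(msgid)
            result.append(msgid)
    return result

entries = [
    ("name", ""), ("name", ""), ("address", "adresse"), ("hospital", ""),
]
todo = untranslated_strings(entries)
print(todo)                       # ['name', 'hospital']
print(sum(len(s) for s in todo))  # 12 billable characters
```

The billable-character total from a pass like this is exactly the quota estimate triune asks for at 12:23.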
[12:38] <triune> not sure when I will be approved as a mentor, if you want timely feedback... google doc for me :)
[12:38] <Ramindu> triune, mine is on a google doc as well, I'll link you
[12:39] <glennp> I'll remind Mark & David about that approval.
[12:40] <sophron> ok i am starting with some questions
[12:40] <sophron> i believe the first thing that needs to be done is to give the user the capability to determine the data he wants to export
[12:40] <sophron> eg per hospital or per event
[12:41] <sophron> and later on start working on the exporting of the data
[12:41] <glennp> or a particular hospital x event
[12:42] <sophron> also give the user the capability to specify time ranges
[12:42] <sophron> yes
[12:42] <glennp> hospital x event x time, yes
[12:43] <sophron> i will probably work this by using some GET variables
[12:43] <sophron> eg
[12:43] <sophron> http://www.nlm.nih.gov/stats?hospital=foo&event=bar
[12:44] <sophron> sounds good?
[12:45] <triune> I think a POST that automatically initiates a download would be better
[12:45] <sophron> that means no online posting
[12:46] <triune> you mean link sharing?
[12:46] <sophron> yes
[12:47] <triune> well, the export portion of the stats module is not available to the public; you are generally a site admin or hospital staff, so already logged in
[12:47] <triune> not really thinking there will ever need to be link sharing
[12:48] <sophron> hmm i see ;)
[12:48] <glennp> Not so sure. We could want reporters to see some of this data
[12:48] <triune> I would like to see more AJAX to be honest... so an XHR to initiate a download would be my preferred method :)
[12:49] <sophron> that's nice!
[12:49] <sophron> i love AJAX so i prefer the XHR method
[12:49] <triune> glenn: if that is true, could we tie it into a simple REST web service?
[12:50] <glennp> Maybe
[12:50] <triune> or even make it part of the SOAP services...
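[Editor's note: the GET-variable scheme sophron proposes above (hospital, event, and later a time range as query parameters) can be sketched server-side as follows. The parameter names follow the chat; the URL, allowed-parameter set, and function name are illustrative assumptions.]

```python
# Sketch of parsing an export request of the form discussed in the chat,
# e.g. /stats?hospital=foo&event=bar, into a filter dict. Unrecognised
# parameters are dropped; the allowed set is an assumption.

from urllib.parse import urlparse, parse_qs

def parse_export_request(url: str) -> dict:
    """Extract hospital/event/time-range filters from an export URL,
    keeping only recognised parameters (first value of each)."""
    allowed = {"hospital", "event", "start", "end"}
    qs = parse_qs(urlparse(url).query)
    return {k: v[0] for k, v in qs.items() if k in allowed}

f = parse_export_request("http://example.org/stats?hospital=foo&event=bar&x=1")
print(f)  # {'hospital': 'foo', 'event': 'bar'}
```

Whether the request arrives as a plain GET link or via the XHR-initiated download triune prefers, the filter extraction step is the same; only the transport differs.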
[12:50] <glennp> But then the user would have to have a client
[12:51] <triune> sorry if I'm asking more questions than answering... this concept is more of Glenn's brainchild
[12:51] <triune> if the user knows how to handle XML, I think they are pretty intelligent already :)
[12:51] <sophron> well i am not speaking only of XML data
[12:52] <glennp> I think for some public events the export should be available to the public. It's not just XML; it includes Excel, CSV
[12:52] <sophron> but of the exports the plugin does already
[12:53] <triune> so the goal is to mimic a PFIF repository but offer more formats?
[12:54] <glennp> And more focused on aggregate stats, not per-person data
[12:54] <glennp> Just like the charting module
[12:55] <triune> sorry if I confused the two... I know we were talking about exporting person data as well as stats for some time...
[12:55] <sophron> why is this information private in the first place?
[12:56] <glennp> It may or may not be
[12:56] <sophron> yes i know, i mean for the NIH
[12:56] <glennp> If it's gathered by a particular hospital, they need some control over the release.
[12:57] <sophron> i see
[12:57] <triune> we handle public disasters as well as private hospital events
[12:57] <glennp> Yeah, makes it complicated.
[12:57] <glennp> For a hospital, they might say, release the daily aggregates, but not the arrival rate data (I'm making this up)
[12:58] <sophron> ok i'll propose an online posting of the results and we can discuss it again
[12:58] <sophron> i hope this is ok
[12:59] <glennp> We probably won't nail down these hospital-policy issues during GSoC 2012
[12:59] <glennp> Yeah, sounds good.
[12:59] <sophron> :)
[12:59] <sophron> so i limited the formats for export
[13:00] <sophron> the ones i got are:
[13:00] <sophron> KML channel for geolocated data
[13:01] <sophron> RSS/Atom feed and XML / JSON formats
[13:01] <sophron> do you think i should add / remove something?
[13:01] <glennp> (random thought re: expressing time ranges: as you saw for KML from TriagePic, it might be convenient to show the user time choices like "today", "last 24 hours", "anytime")
[13:02] <glennp> Most important in the real world: comma-separated, tab-separated, Excel
[13:02] <sophron> yes the time choices sound great
[13:02] <sophron> yes i will add those too :)
[13:03] <sophron> and a last question
[13:03] <glennp> Those 3 work when it's a flat table format
[13:04] <glennp> XML is better if heterogeneous
[13:04] <sophron> do you think there should be a public Google map page based on the generated KML
[13:04] <sophron> forget the 'public'
[13:04] <sophron> i mean apart from the export of KML
[13:05] <sophron> a page with a Google map using it
[13:05] <glennp> I know the disaster community has a sense of "globes" for posting such info.
[13:05] <sophron> eg take a look at this mockup: http://sophron.latthi.com/gsoc.png
[13:05] <sophron> the map should be disaster-based of course
[13:06] <glennp> triune: you had some interest in reinvigorating maps within PL/Vesuvius
[13:06] <triune> yes, I do love data visualized on maps :)
[13:07] <triune> unfortunately, we don't have a lot of that data to work with yet
[13:07] <sophron> i believe it would work well with the KML data ;)
[13:07] <glennp> Gets into situational awareness. Good stuff. Unclear how much infrastructure we would have in a GSoC 2012 timeframe to support that within PL
[13:09] <sophron> ok so i should add it to my proposal as optional, depending on the time i have
[13:09] <sophron> is that ok?
[13:09] <glennp> There used to be some code in Sahana PHP, but it kinda got broken during the Google-to-Open-Maps transition, & we dropped it.
[13:10] <glennp> Yeah, make it end-of-summer, maybe more a design than an implementation. Don't want it to eat the actual export part.
[13:11] <sophron> nice!
[13:11] <glennp> Anything else?
[13:12] <sophron> that's all from me
[13:12] <sophron> anything else you can advise me on?
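[Editor's note: glennp's preset time choices ("today", "last 24 hours", "anytime") resolve into concrete (start, end) bounds for the export filter. A minimal sketch, where the labels come from the chat and everything else is an assumption:]

```python
# Map the preset time-choice labels mentioned in the chat to a
# (start, end) datetime range for the export filter. 'anytime' has
# no lower bound, represented here as None.

from datetime import datetime, timedelta
from typing import Optional, Tuple

def time_choice_to_range(choice: str, now: datetime) -> Tuple[Optional[datetime], datetime]:
    """Resolve a preset label to a (start, end) pair relative to now."""
    if choice == "today":
        return now.replace(hour=0, minute=0, second=0, microsecond=0), now
    if choice == "last 24 hours":
        return now - timedelta(hours=24), now
    if choice == "anytime":
        return None, now
    raise ValueError(f"unknown time choice: {choice!r}")

now = datetime(2012, 4, 3, 13, 0)
print(time_choice_to_range("last 24 hours", now)[0])  # 2012-04-02 13:00:00
```

Presets like these sit naturally in a pulldown next to the hospital/event selectors, with an explicit start/end picker as the escape hatch for anything the presets do not cover.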
[13:12] <sophron> i mean if you have something else to say for me to mention in my proposal
[13:13] <glennp> Just don't miss the Friday deadline to get it in; Google will cut no slack!
[13:13] <sophron> yes, i will probably submit it by tomorrow
[13:13] <sophron> thanks a lot Greg and Glenn
[13:13] <sophron> it was nice talking to you
[13:14] <sophron> :)
[13:14] <glennp> Looking forward to it. Thanks, sophron. Bye.
[13:14] <triune> no problem, thanks for your interest in Sahana!