Vesuvius April 3, 2012 IRC about GSoC

Listed here because official logging didn't capture this session.

[#sahana-meeting channel | Note: This channel is logged by sahanamtglog | Logs available at http://logs.sahanafoundation.org/sahana-meeting]
[12:01] == glennp [820efe19@gateway/web/freenode/ip.130.14.254.25] has joined #sahana-meeting
[12:01] <Ramindu> hi, sophron!
[12:01] <sophron> hello everybody!
[12:01] <glennp> Hi, all
[12:02] <triune> <~ Greg Miernicki here :)
[12:02] <triune> g'day all!
[12:02] <sophron> triune: hello Greg, nice to meet you :)
[12:02] <triune> hello sophron
[12:03] <glennp> Shall we start with Ramindu's proposal?
[12:04] <sophron> Lan will join us today or not?
[12:04] <glennp> ...about multiple language support, just submitted to melange.  I just read it over a moment ago.
[12:04] <glennp> Let me remind Lan...
[12:06] <glennp> ... done.
[12:06] <Ramindu> so, moste esteemed mentors, what sayeth thou?
[12:06] <glennp> Looks good generally.  Just 2 areas of concern for me
[12:06] <triune> I sayeth, I have not readeth it yet :)
[12:07] <glennp> 1 is, that you & Greg already touched on I believe, is what happens now that Google Translate is no longer a free service.
[12:07] <Ramindu> triune, interesting... ;) glennp, I'm listening...
[12:08] <triune> doh, I don't think I'm added as a mentor in Melange yet
[12:08] <glennp> The funding model for Google translate is pay per word, which is problematic for us
[12:08] <glennp> Management here doesn't like unbounded costs.
[12:09] <Ramindu> hrm hem...
[12:09] <Ramindu> it's $20 per 1million chars
[12:09] <Ramindu> char count on Vesuvius strings?
[12:09] <Ramindu> one-time caching?
[12:10] <triune> Glenn: it also looks like you cannot simply do monthly / quarterly billing with the API either ~ having a credit card and Google Wallet are a must
[12:10] <glennp> A model where you pay a fixed amount per month or year and get a max # of words would work better for us.
[12:10] <glennp> So we may have to look for alternatives to Google for this.
[12:11] <triune> or we could add a CC and buy a set amount of quota and remove the card ~
[12:11] <Ramindu> so... a good solution (either replacement or some sort of clever ploy where we know the amount we have to pay beforehand) for Google Translate needs to be part of the solution?
[12:11] <glennp> Definitely need caching on any paid service, and tracking of what changed and needs to be retranslated.
[12:12] <glennp> Ploys are good if you can find 'em.
[12:12] <Ramindu> I've heard they're quick on their feet, hard to catch.
[12:12] <glennp> Blend into the forest too
[12:13] <glennp> Other issue was:
[12:13] <glennp> Your model for Resource Pages was based on changing the default locale for the installation.
[12:15] <glennp> That's OK, but I'm envisioning a broader system where a particular page is pretranslated into multiple languages (differs per event), and the user selects
[12:15] <Ramindu> yes, this would mean loading the pre-translated strings in to the database at the point of installation
[12:15] <Ramindu> oh I see
[12:15] <glennp> So, for example, for the Haiti earthquake, the "How do I search" page is available in Eng, French, Creole, and the visitor chooses a language from a pulldown.
[12:15] <Ramindu> I guess keeping pre-translated texts in the database isn't a big problem, right?
[12:16] <glennp> Yeah, not a big problem.  Just need to flag which language for each entry.
[12:16] <glennp> So that's a new column
[12:16] <Ramindu> so simply this would mean considering the session variable 'locale' and applying the relevant pre-translated string, if available
[12:17] <Ramindu> that sounds even more doable and within reach that what I suggested in my proposal ;)
[12:17] <Ramindu> *than
[12:17] <glennp> There would still be a default locale for the installation.
[12:18] <Ramindu> of course, that will be considered as well
[12:18] <Ramindu> triune, the session var will be set to the default locale until the user specifies otherwise through the dropdown, right?
[12:19] <glennp> He says right
[12:19] <triune> yes, that is correct...
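A minimal sketch of the locale fallback agreed on above, assuming pre-translated pages are stored one row per (page, language) pair and the session carries a 'locale' value. The table contents, the page id `how_do_i_search`, and the function name `resolve_page` are hypothetical and only illustrate the lookup order; they are not taken from the Vesuvius code.

```python
# Hypothetical stand-in for a resource-page table with the new language column.
PAGES = {
    ("how_do_i_search", "en"): "How do I search?",
    ("how_do_i_search", "fr"): "Comment faire une recherche ?",
}

DEFAULT_LOCALE = "en"  # assumed installation default


def resolve_page(page_id, session, pages=PAGES, default_locale=DEFAULT_LOCALE):
    """Return (locale_used, text) for a page.

    The session locale is tried first; if the user has not picked one, or no
    pre-translated text exists for it, the installation default is used.
    """
    requested = session.get("locale", default_locale)
    for candidate in (requested, default_locale):
        text = pages.get((page_id, candidate))
        if text is not None:
            return candidate, text
    raise KeyError(f"no text stored for page {page_id!r}")
```

With this lookup order, a visitor who picks a language that has no pre-translated entry for the current event still sees the page in the default locale.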
[12:19] <Ramindu> triune is AFK?
[12:19] <triune> had to momentarily run over to the printer! :D
[12:20] <glennp> He flits around like a ploy  :->
[12:20] <Ramindu> haha :D
[12:20] <glennp> That's all I've got as feedback.
[12:21] <glennp> If you want to research Google Translate alternatives, would be good.
[12:21] <Ramindu> hmmm...
[12:21] <Ramindu> is it ok if I look for a way to use it while minimizing cost?
[12:22] <glennp> Sure
[12:22] <glennp> Maybe some Translate/Pootle hybrid
[12:23] <triune> perhaps also keep in mind a way to calculate the cost as well... ie. how many translations will need to be made (per language, per person records, per _t(), etc)
[12:23] <Ramindu> uhm... we already have a Translate/Pootle hybrid :D
[12:23] <triune> this way we can estimate how much of a translate api quota we would need to purchase
[12:24] <glennp> Ramindu: yeah, even as I typed that I thought -doh
[12:24] <Ramindu> I see...
[12:24] <Ramindu> there's a lot of potential here... when the code is looping through person records etc, we need to translate the labels (e.g. 'name') only once
[12:25] <Ramindu> unlike passing raw HTML to Translate, like what's being done now, it could be more intelligent
[12:25] <Ramindu> so even if we're using Translate only once and then caching it, cost would still be minimized
[12:26] <glennp> It would help to do trickle translation in advance for known strings, maybe prioritized by language.
[12:27] <glennp> So January we do Chinese, February we do Japanese, etc.
[12:27] <glennp> Spread out costs over time
[12:27] <Ramindu> that's a great idea!
[12:28] <triune> pre-caching the missing strings from the .po/.mo files is a good idea I had not considered
[12:29] <glennp> That's only 1 way to slice it.  Another:  January we do "how do I search" in 140 languages, February we do "how do I report" in 140 languages, etc.
[12:30] <glennp> Have to have some way to keep track of all this without going nuts, in the face of changes to the underlying English
[12:30] <Ramindu> ok now we have 3 nice ideas we can implement to minimize the financial pain of using Google Translate:
[12:31] <Ramindu> 1. We check *.po for untranslated strings, cache GTranslations of those
[12:31] <Ramindu> 2. We do monthly trickle-translations of one language per month, while building methodology to track changes to English UI strings
[12:32] <Ramindu> 3. (this one is for the real efficiency nuts) Intelligent handling of strings with recurring strings within the same page/other pages being sent to GTranslate only once
[12:33] <Ramindu> you do realize this is going to make my proposal uber-long? :D
[12:34] <glennp> Sure.  Some of this latter stuff may just be to create a design, rather than implementation
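A rough sketch of the cost estimate triune asked for, combining ideas 1 and 3 above: repeated strings are counted once, and strings already cached for a language are skipped. It assumes the $20 per 1 million characters figure Ramindu quoted and that billing is by source characters sent; the function name `estimate_cost` and the cache layout are hypothetical.

```python
PRICE_PER_MILLION_CHARS = 20.0  # USD, the figure quoted at 12:09


def estimate_cost(strings, languages, cache=None):
    """Estimate characters and USD needed to translate `strings`.

    strings:   UI strings, possibly with repeats (e.g. the label 'Name'
               appearing once per person record)
    languages: target language codes
    cache:     set of (language, string) pairs already translated
    """
    cache = cache if cache is not None else set()
    unique = set(strings)  # each distinct string is sent only once
    chars = 0
    for lang in languages:
        for text in unique:
            if (lang, text) not in cache:
                chars += len(text)
    return chars, chars * PRICE_PER_MILLION_CHARS / 1_000_000


if __name__ == "__main__":
    labels = ["Name", "Name", "Age"]
    chars, usd = estimate_cost(labels, ["fr", "zh"])
    print(chars, "characters, about", usd, "USD")
```

Running the same estimate once per month with a single language in `languages` would give the per-month figure for the trickle-translation schedule glennp described.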
[12:35] <glennp> sophron: Lan just phoned me, regrets she can't successfully connect to chat from her mobile.
[12:35] <sophron> oh
[12:35] <sophron> ok that's fine
[12:36] <sophron> i guess you could enlight me with some issues on my proposal ;)
[12:36] <glennp> I suggest you create a google doc, or submit a prelim melange proposal, and Lan can give you feedback on that.  As can we.
[12:37] <sophron> i will do that (maybe today)
[12:37] <glennp> But any enlightenment triune & i can give you now  ?
[12:37] <sophron> yes please
[12:37] <Ramindu> so triune and glennp, I guess we should wrap up for today, and give sophron a chance to discuss his proposal?
[12:38] *Ramindu* yup
[12:38] <Ramindu> I'll update my proposal and drop you a mail
[12:38] <glennp> Thanks, Ramindu.
[12:38] <triune> not sure when I will be approved as a mentor, if you want timely feedback... google doc for me :)
[12:38] <Ramindu> triune, mine is on a google doc as well, I'll link you
[12:39] <glennp> I'll remind Mark & David about that approval.
[12:40] <sophron> ok i am starting with some questions
[12:40] <sophron> i believe the first thing that needs to be done is to give the capability to the user to determine the data he wants to export
[12:40] <sophron> eg per hospital  or per event
[12:41] <sophron> and later on start working the exporting of the data
[12:41] <glennp> or a particular hospital x event
[12:42] <sophron> also give the capability to the user to specify timeranges
[12:42] <sophron> yes
[12:42] <glennp> hospital x event x time, yes
[12:43] <sophron> i will work this probably by using some GET variables
[12:43] <sophron> eg
[12:43] <sophron> http://www.nlm.nih.gov/stats?hospital=foo&event=bar
[12:44] <sophron> sounds good?
[12:45] <triune> I think a post that automatically initiates a download would be better
[12:45] <sophron> that means no online posting
[12:46] <triune> you mean link sharing?
[12:46] <sophron> yes
[12:47] <triune> well, the export portion of the stats module is not available to the public, you are generally a site admin, or hospital staff, so already logged in
[12:47] <triune> not really thinking there will ever need to be link sharing
[12:48] <sophron> hmm i see ;)
[12:48] <glennp> Not so sure.  We could want reporters to see some of this data
[12:48] <triune> I would like to see more ajax to be honest... so an XHR to iniated a download would be my preferred method :)
[12:49] <triune> *initiate
[12:49] <sophron> that's nice!
[12:49] <sophron> i love AJAX so i prefer the XHR method
[12:49] <triune>  glenn: if that is true, could be tie it into a simple REST web service?
[12:49] <triune> *we
[12:50] <glennp> Maybe
[12:50] <triune> or even make it part of the SOAP services...
[12:50] <glennp> But then the user would have to have a client
[12:51] <triune> sorry if I'm asking more questions than answering... this concept is more of Glenn's brainchild
[12:51] <triune> if the user knows how to handle XML, I think they are pretty intelligent already :)
[12:51] <sophron> well i am not speaking only for XML data
[12:52] <glennp> I think for some public events the export should be available to the public.  It's not just XML, includes Excel, csv
[12:52] <sophron> but for the exports the plugin does already
[12:53] <triune> so the goal is to mimic a pfif repository but offer more formats?
[12:54] <glennp> And more focused on aggregate stats, not per-person data
[12:54] <glennp> Just like the charting module
[12:55] <triune> sorry if I confused the two... I know we were talking about exporting person data as well as stats for some time...
[12:55] <sophron> why is this information private in the first place?
[12:56] <glennp> It may or may not be
[12:56] <sophron> yes i know i mean for the nih
[12:56] <glennp> If it's gathered by a particular hospital, they need some control over the release.
[12:57] <sophron> i see
[12:57] <triune> we handle public disasters as well as private hospital events
[12:57] <glennp> Yeah, makes it complicated.
[12:57] <glennp> For a hospital, they might say, release the daily aggregates, but not the arrival rate data (I'm making this up)
[12:58] <sophron> ok i'll propose an online posting of the results and we can discuss it again
[12:58] <sophron> i hope this is ok
[12:59] <glennp> We probably won't nail down these hospital-policy issues during GSoC 2012
[12:59] <glennp> Yeah, sounds good.
[12:59] <sophron> :)
[12:59] <sophron> so i limited the formats for export
[13:00] <sophron> the ones i got are:
[13:00] <sophron> KML channel for geolocated data
[13:01] <sophron>  RSS/Atom feed and XML / JSON formats
[13:01] <sophron> do you think i should add / remove something?
[13:01] <glennp> (random thought RE expressing time ranges:  as you saw for kml from triagepic, it might be convenient to show the user time choices like "today", "last 24 hours", "anytime")
[13:02] <glennp> Most important in real world:  comma-separated, tab-separated, Excel
[13:02] <sophron> yes the time choices sound great
[13:02] <sophron> yes i will add those too :)
[13:03] <sophron> and a last question
[13:03] <glennp> Those 3 work when it's a flat table format
[13:04] <glennp> XML better if heterogeneous
[13:04] <sophron> do you think there should be a public Google map page based on the generated KML
[13:04] <sophron> forget the 'public'
[13:04] <sophron> i mean apart from the export of KML
[13:05] <sophron> a page with Google map using it
[13:05] <glennp> I know the disaster community has a sense of "globes" for posting such info.
[13:05] <sophron> eg take a look at this mockup: http://sophron.latthi.com/gsoc.png
[13:05] <sophron> the map should be disaster based of course
[13:06] <glennp> triune: you had some interest in reinvigorating maps within PL/Vesuvius
[13:06] <triune> yes, I do love data visualized on maps :)
[13:07] <triune> unfortunately, we don't have a lot of that data to work with yet
[13:07] <sophron> i believe it would work well with the KML data ;)
[13:07] <glennp> Gets into situational awareness.  Good stuff.  Unclear how much infrastructure we would have in a GSoC 2012 timeframe to support that within PL
[13:09] <sophron> ok so i should add it on my proposal as optional depending on the time i have
[13:09] <sophron> is that ok?
[13:09] <glennp> Used to be some code in Sahana PHP, but kinda got broken during Google to Open Maps transition, & we dropped it.
[13:10] <glennp> Yeah, make it end-of-summer, maybe more a design than implementation.  Don't want it to eat the actual export part.
[13:11] <sophron> nice!
[13:11] <glennp> Anything else?
[13:12] <sophron> that's all by me
[13:12] <sophron> anything else you can advise me?
[13:12] <sophron> i mean if you have something else to say for me to mention on my proposal
[13:13] <glennp> Just don't miss the Friday deadline to get it in, Google will cut no slack!
[13:13] <sophron> yes, i will probably submit it by tomorrow
[13:13] <sophron> thanks a lot Greg and Glenn
[13:13] <sophron> it was nice talking to you
[13:14] <sophron> :)
[13:14] <glennp> Looking forward to it.  Thanks, sophron.  Bye.
[13:14] <triune> no problem, thanks for your interest in Sahana!
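
A minimal sketch of the export filtering discussed from 12:40 onward (hospital x event x time) with the flat formats glennp named at 13:02, comma-separated and tab-separated. The row fields (`hospital`, `event`, `time`, `count`) and the function names are hypothetical; the time choices are the ones suggested at 13:01.

```python
import csv
import io
from datetime import datetime, timedelta

FIELDS = ["hospital", "event", "time", "count"]  # assumed aggregate-stats columns


def time_window(choice, now):
    """Map a user-facing time choice to a start datetime (None = no limit)."""
    if choice == "today":
        return now.replace(hour=0, minute=0, second=0, microsecond=0)
    if choice == "last 24 hours":
        return now - timedelta(hours=24)
    if choice == "anytime":
        return None
    raise ValueError(f"unknown time choice: {choice!r}")


def export_rows(rows, hospital=None, event=None, since=None, delimiter=","):
    """Filter rows by hospital, event and start time, then write delimited text.

    Pass delimiter="\t" for tab-separated output.
    """
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter=delimiter,
                            lineterminator="\n")
    writer.writeheader()
    for row in rows:
        if hospital is not None and row["hospital"] != hospital:
            continue
        if event is not None and row["event"] != event:
            continue
        if since is not None and row["time"] < since:
            continue
        writer.writerow({**row, "time": row["time"].isoformat()})
    return out.getvalue()
```

Whether the filters arrive as GET variables or in an XHR POST, the same function could serve both, since it only depends on the parsed values.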
