Archive | February, 2012

Regex to the Rescue

29 Feb

Problem:

At work, we had a requirement to transform text URLs into clickable links. Ideally, these links would have been entered as well-formatted hyperlinks, but that wasn’t the case, and we didn’t have the option to implement that fix. Initially, the idea was to have end users include the http:// protocol on links so that only text that had http:// would be transformed. This was rather optimistic, as end users didn’t follow this training. Users were entering links in all shapes and forms. Some links had no protocol or host. This led to many links being missed.

Solution:

To solve the problem, I wrote a little regex to capture the links and then reformatted them. The regex is below:

\b((https?|ftp)://)?([A-Za-z0-9-]+[.]){1,4}(com|org|us|net|edu)([A-Za-z0-9./?=_&%]+)?\b

This will match a link with or without a protocol or host specified, an alphanumeric domain with dashes, and a path and query string following it. It only matches the TLDs listed in the middle section. This can be problematic or useful depending on what you want to match. Overall, this was a major improvement that allowed end users to continue with their behavior and still get the result we were looking for. This solution has been rock solid so far in capturing the links entered, but it can still miss certain URLs. YMMV.
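
For illustration, here is a minimal Python sketch of how the capture-and-reformat step might look. It is not the code we used at work. It assumes the word boundaries are written as \b and that the character classes are written without the | separators (inside brackets a | would match a literal pipe). The linkify function name and the choice to default to http:// when no protocol was typed are my own assumptions for this example.

```python
import html
import re

# Assumed cleaned-up form of the pattern: \b word boundaries and
# character classes without "|" separators.
URL_PATTERN = re.compile(
    r"\b((https?|ftp)://)?([A-Za-z0-9-]+[.]){1,4}(com|org|us|net|edu)"
    r"([A-Za-z0-9./?=_&%]+)?\b"
)


def linkify(text):
    """Wrap each matched URL in an anchor tag (hypothetical helper)."""

    def to_anchor(match):
        url = match.group(0)
        # Group 1 is the optional protocol; default to http:// when absent.
        href = url if match.group(1) else "http://" + url
        return '<a href="{}">{}</a>'.format(
            html.escape(href, quote=True), html.escape(url)
        )

    return URL_PATTERN.sub(to_anchor, text)


if __name__ == "__main__":
    print(linkify("Docs are at www.example.org/help and ftp://files.example.net"))
```

Note that only the matched URL is escaped here. Text that is going onto a web page would normally need the rest of the input escaped as well.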

GetJar is logging more than you think

26 Feb

I was doing some network analysis on my phone related to another matter, and noticed that GetJar was logging some of my activity. This isn’t that surprising for an app store that provides free apps. Nothing comes for free. I would expect them to log some information related to the apps provided through their store. However, what surprised me was that logging occurred as I was uninstalling apps that I did not purchase or install through GetJar. After seeing this behavior, GetJar got an immediate uninstall. I don’t know what other data GetJar might have been logging, as I didn’t leave it on long enough to find out any more.

Here is what was logged:

GET /backchannel/metadata/
?gjClientInstallationID=<24char string>
&androidID=<44 char string>
&gjClientVerCode=3378
&src=gjca
&gjClientVerName=3.3.78
&packageName=com.qik.android
&status=UNINSTALLED
&versionCode=382
&versionName=0.03.765
&appLabel=Qik
&uninstallTime=1329696276134

I have reformatted this GET request for easier reading. The character count is based on the decoded URL. There is nothing super personal in there, but they are definitely collecting what apps you are using.
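
To make the fields a little easier to inspect, here is a rough Python sketch that splits the request back into its parameters. The two ID values are placeholders since I redacted the real ones, the summarize function name is made up for this example, and reading uninstallTime as milliseconds since the Unix epoch is an assumption on my part based on its 13-digit length.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# The captured request joined back into one line.
# The two ID values are placeholders, not real identifiers.
REQUEST_TARGET = (
    "/backchannel/metadata/"
    "?gjClientInstallationID=INSTALLATION_ID_PLACEHOLDER"
    "&androidID=ANDROID_ID_PLACEHOLDER"
    "&gjClientVerCode=3378"
    "&src=gjca"
    "&gjClientVerName=3.3.78"
    "&packageName=com.qik.android"
    "&status=UNINSTALLED"
    "&versionCode=382"
    "&versionName=0.03.765"
    "&appLabel=Qik"
    "&uninstallTime=1329696276134"
)


def summarize(target):
    """Return the path, a flat parameter dict, and the uninstall time in UTC."""
    parts = urlsplit(target)
    # parse_qs returns a list per key; each key appears once here.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    # Assumption: uninstallTime is milliseconds since the Unix epoch.
    seconds, _millis = divmod(int(params["uninstallTime"]), 1000)
    when = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return parts.path, params, when


if __name__ == "__main__":
    path, params, when = summarize(REQUEST_TARGET)
    print(path)
    print(params["packageName"], params["status"], when.isoformat())
```

If the millisecond assumption holds, the timestamp falls on 20 Feb 2012 (UTC), which lines up with when I was uninstalling apps.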

 

I took a quick look at GetJar’s privacy policy to see if this was disclosed. As with many privacy policies, the sections on personal information collection are a bit vague and open-ended. Even so, I didn’t get the sense that they would be collecting information on what apps I was uninstalling or using. Here’s the relevant excerpt from their privacy policy:

Personal Information Collected via Technology

As you use the GetJar Site or any GetJar Service, some information may also be collected passively, including your Internet protocol address, browser type, access time, mobile phone model, and telecom carrier. We may also store a small text file called a "Cookie" on your computer or phone to store certain information about your use of the GetJar Site or GetJar Services. We may use both session Cookies (which expire once you close your browser) and persistent Cookies (which stay on your computer or phone until you delete them).

Personal Information from Other Sources

We may receive Personal Information about you from other sources, including other users. We may associate this information with the other Personal Information we have collected about you.

I went on to take a quick look at their logging server eventlogger.getjar.com. It discloses some configuration information, though I am not sure how accurate it is. If the information disclosed is to be trusted, the jetty.config.contextMap seems to give an indication of what else is collected or sent to GetJar.

/*=com.getjar.els.servlet.StatusServlet;
/thrift/*=com.getjar.els.servlet.ThriftServerServlet;
/backchannel/messaging/*=com.getjar.els.servlet.BackchannelMessagingServlet;
/backchannel/usage/*=com.getjar.els.servlet.BackchannelUsageServlet;
/backchannel/metadata/*=com.getjar.els.servlet.BackchannelMetadataServlet;
/backchannel/event/*=com.getjar.els.servlet.BackchannelEventServlet;
/20110506/4933/backchannel/usage/*=com.getjar.els.servlet.BackchannelUsageServlet;
/20111010/5001/backchannel/usage/*=com.getjar.els.servlet.BackchannelUsageServlet;
/20111102/5002/backchannel/usage/*=com.getjar.els.servlet.BackchannelUsageServlet

* Reformatted for easier reading.

It appears that messaging, usage, and event details might be logged as well. I’m not sure what those all entail, as uninstalling an app fell under Metadata.
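
As a small aid for reading the context map listed above, this Python sketch turns the semicolon-separated entries into a path-to-servlet lookup. The parse_context_map function name is my own, and the path=class format is simply inferred from the listing. I don't know how Jetty or GetJar actually consume this setting.

```python
# The contextMap value as disclosed, one entry per line for readability.
CONTEXT_MAP = """
/*=com.getjar.els.servlet.StatusServlet;
/thrift/*=com.getjar.els.servlet.ThriftServerServlet;
/backchannel/messaging/*=com.getjar.els.servlet.BackchannelMessagingServlet;
/backchannel/usage/*=com.getjar.els.servlet.BackchannelUsageServlet;
/backchannel/metadata/*=com.getjar.els.servlet.BackchannelMetadataServlet;
/backchannel/event/*=com.getjar.els.servlet.BackchannelEventServlet;
/20110506/4933/backchannel/usage/*=com.getjar.els.servlet.BackchannelUsageServlet;
/20111010/5001/backchannel/usage/*=com.getjar.els.servlet.BackchannelUsageServlet;
/20111102/5002/backchannel/usage/*=com.getjar.els.servlet.BackchannelUsageServlet
"""


def parse_context_map(raw):
    """Map each URL pattern to the short class name of its servlet."""
    routes = {}
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Assumed entry format: <url pattern>=<fully qualified class name>
        path, _, servlet = entry.partition("=")
        routes[path] = servlet.rsplit(".", 1)[-1]
    return routes


if __name__ == "__main__":
    for path, servlet in parse_context_map(CONTEXT_MAP).items():
        print(servlet, path)
```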

Disclaimer: By writing this, I’m not claiming that GetJar is engaging in malicious activities. If anything, I want others to be aware of this and make an informed decision. No one is being forced to use this app so choose to do what you will.