Connection Geography JS

Last Edit: 2019-05-05
Languages used: JavaScript
APIs used: Google Maps API (v3, JavaScript), LinkedIn (internal) API
Supported platforms: Firefox
Status: In Progress
GitHub Repo: ConnectionGeographyJS
Connection-Geography JS (Example Data)

In 2014-2015, I wrote a small web app to plot LinkedIn connections/contacts on a map, with relative densities. I quite liked it, and it did a good job until LinkedIn removed some of their public APIs in 2015. With the primary data provider gone and no immediate alternatives available for personal use, this initial prototype was retired.

However, in software, almost any problem can be solved or worked around given enough time. With that in mind, I decided to try to resurrect this project, this time within the LinkedIn page itself.

Step 0: Establishing Scope

Before digging too deep into any one hole, I decided to figure out what problems needed to be solved and what was already “done”.

The first round of this was:

  • Does the majority of the old JavaScript work?
  • What effort is needed to acquire connection data (and is it within terms of service)?

Legacy Code

To my delight, most of the old code worked with few changes: cull the old PHP, create a new index.html to test in, update the API keys and permissions, update the API bootstrapping, and add some mock connection data. Still chunky, but less than a full rewrite.

This is not to say it was without fault: the code structure and style were a bit underdeveloped, and the documentation was not as thorough as I would have liked. Then again, that’s mostly what developing with legacy code is.

Finding a Data Provider

Getting the new data is where it gets tricky. Originally I had thought about extracting it from the DOM of search results, but this is fragile and becomes tedious over multiple pages. Then I thought, “Wait! They’ve got an internal API with JSON, and I’m planning to insert code into the page; I might as well use their API.”

To do this, I needed to figure out how to extract their Cross-Site Request Forgery (CSRF) token. I looked into a couple of avenues before settling on document.cookie, which had it readily available.
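A minimal sketch of pulling a token out of a cookie string such as document.cookie. The cookie name "JSESSIONID" and the "csrf-token" header name are assumptions made for illustration, not details confirmed by this write-up; readCookie and csrfHeaders are hypothetical helper names.

```javascript
// Return the value of a named cookie from a string shaped like document.cookie,
// or null when no cookie with that name is present.
function readCookie(cookieString, name) {
  for (const pair of cookieString.split(';')) {
    const trimmed = pair.trim();
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue; // skip malformed entries without "="
    if (trimmed.slice(0, eq) === name) {
      // Some servers wrap the value in double quotes; strip them if present.
      return trimmed.slice(eq + 1).replace(/^"(.*)"$/, '$1');
    }
  }
  return null;
}

// Assumption: the CSRF token is stored in this cookie. Adjust to whatever
// the page actually uses.
const CSRF_COOKIE_NAME = 'JSESSIONID';

// Build the extra request headers for an internal API call.
// Assumption: the API expects the token in a "csrf-token" header.
function csrfHeaders(cookieString) {
  const token = readCookie(cookieString, CSRF_COOKIE_NAME);
  if (token === null) {
    throw new Error('CSRF cookie not found');
  }
  return { 'csrf-token': token };
}
```

Inside the page this would be called as csrfHeaders(document.cookie) and the result merged into the headers of a same-origin request.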

As for legality, reading through LinkedIn’s terms of service, I found that they disallow scraping job postings and granting third parties access to scrape. This project deals only with connections/contacts, is used only by the current user (me), and doesn’t store any of the data already available to the page.

As of writing, everything seems to be above board, and with the two broader parts deemed plausible, the project went from idea to in progress.

Step 1: Injecting Code into the Page

Digging deeper, I found the next hurdle to be a joy of security and the bane of UserScripts: Content Security Policy (CSP). These lovely headers block scripts from running in certain contexts, which is great for preventing Cross-Site Scripting (XSS) and terrible for injecting custom scripts (essentially XSS from yourself).

LinkedIn has a fairly strict CSP, which becomes a problem in two ways.

First, UserScripts cannot execute within the page scope, as they don’t have permission to run. This could be circumvented by using the different contexts offered by UserScript plugins; these would allow the scripts to run in isolation from the page JavaScript and still respect the CSP. Problem solved? Not quite.

Our second problem comes from Google Maps. The Google Maps API works by including their library with an appropriate key. On load, said script starts loading additional scripts which, in turn, add/remove various resources throughout its run.
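For reference, the usual bootstrap for the Maps JavaScript API is a single script tag carrying the key and a callback name. The sketch below assumes a browser document for the loader; buildMapsUrl and loadGoogleMaps are hypothetical helper names, and the key and callback values are placeholders.

```javascript
// Build the bootstrap URL for the Google Maps JavaScript API.
function buildMapsUrl(apiKey, callbackName) {
  const url = new URL('https://maps.googleapis.com/maps/api/js');
  url.searchParams.set('key', apiKey);
  // Name of a global function the library calls once it has loaded.
  url.searchParams.set('callback', callbackName);
  return url.toString();
}

// Append the bootstrap script to the page (browser only). From here the
// library fetches its own follow-up scripts and resources, which is the
// part a strict CSP gets in the way of.
function loadGoogleMaps(apiKey, callbackName) {
  const script = document.createElement('script');
  script.src = buildMapsUrl(apiKey, callbackName);
  script.async = true;
  document.head.appendChild(script);
}
```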

Due to the heavily asynchronous nature of the API, these resources cannot easily be loaded through UserScript @require entries or by GET requests, meaning we needed to figure out how to allow Google Maps without breaking the CSP.

This spun off a small side project, GoogleMapsEverywhereCSP: a Firefox Add-on to intercept and rewrite the CSP. A quick MVP Add-on, it got the job done and allowed the project to move forward.

As we’re already modifying the CSP, and the scope of the UserScript + Google Maps would become an awkward mess, I opted to add localhost to the CSP (not published). This allows me to serve the resources from the local machine during this early development stage.
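To illustrate the general idea of rewriting a CSP header, here is a rough sketch, not the actual GoogleMapsEverywhereCSP code. addCspSources is a hypothetical helper, the directive list, the URL pattern, and the localhost port are assumptions, and the listener uses the blocking webRequest API available to Firefox WebExtensions (it needs the webRequest and webRequestBlocking permissions).

```javascript
// Append sources to the named directives of a CSP string.
// Directives missing from the policy are left alone, not created.
function addCspSources(policy, directiveNames, sources) {
  return policy
    .split(';')
    .map((directive) => directive.trim())
    .filter((directive) => directive.length > 0)
    .map((directive) => {
      const [name, ...values] = directive.split(/\s+/);
      if (!directiveNames.includes(name.toLowerCase())) return directive;
      // 'none' cannot be combined with other sources, so drop it.
      const kept = values.filter((value) => value !== "'none'");
      const missing = sources.filter((source) => !kept.includes(source));
      return [name, ...kept, ...missing].join(' ');
    })
    .join('; ');
}

// Assumed values, for illustration only.
const DIRECTIVES = ['script-src', 'img-src', 'style-src', 'connect-src'];
const SOURCES = ['https://maps.googleapis.com', 'http://localhost:8000'];

// Register the header rewrite only when running inside a WebExtension.
if (typeof browser !== 'undefined' && browser.webRequest) {
  browser.webRequest.onHeadersReceived.addListener(
    (details) => {
      for (const header of details.responseHeaders) {
        if (header.name.toLowerCase() === 'content-security-policy' && header.value) {
          header.value = addCspSources(header.value, DIRECTIVES, SOURCES);
        }
      }
      return { responseHeaders: details.responseHeaders };
    },
    { urls: ['https://www.linkedin.com/*'], types: ['main_frame'] },
    ['blocking', 'responseHeaders']
  );
}
```

Keeping the string rewrite separate from the listener makes it easy to check outside the browser, and the rewrite is idempotent, so applying it twice does not duplicate sources.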

Step 2: Creating Data Providers

This step is still ongoing.

Before consuming internal LinkedIn APIs, we needed to confirm what data is available and the form it takes. More detail can be found in the repo, but to give an overview:

Connection data can readily be gleaned from the search API, using the appropriate Foreign Keys/Entities to link different sets of objects together in the JSON.
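A small sketch of that kind of join, using an invented response shape. The field names (elements, included, entityUrn, profileUrn, locationUrn, name) are hypothetical stand-ins for illustration; the real structure is documented in the repo.

```javascript
// Index a flat list of entities by their identifier for quick lookup.
function indexByUrn(entities) {
  const byUrn = new Map();
  for (const entity of entities) {
    byUrn.set(entity.entityUrn, entity);
  }
  return byUrn;
}

// Follow the key on each search hit to its profile, then the profile's
// key to its location, producing one flat record per connection.
function resolveConnections(response) {
  const byUrn = indexByUrn(response.included);
  return response.elements.map((hit) => {
    const profile = byUrn.get(hit.profileUrn);
    const location = profile ? byUrn.get(profile.locationUrn) : undefined;
    return {
      name: profile ? profile.name : null,
      location: location ? location.name : null,
    };
  });
}

// Mock data in the assumed shape.
const mockResponse = {
  elements: [{ profileUrn: 'urn:profile:1' }, { profileUrn: 'urn:profile:2' }],
  included: [
    { entityUrn: 'urn:profile:1', name: 'Ada', locationUrn: 'urn:geo:10' },
    { entityUrn: 'urn:profile:2', name: 'Grace', locationUrn: 'urn:geo:99' },
    { entityUrn: 'urn:geo:10', name: 'London' },
  ],
};
```

With the mock data above, resolveConnections(mockResponse) yields Ada in London and Grace with a null location, since her location key has no matching entity.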

Profile data (the user running the script) is a little more indirect. The JSON for this is dropped directly into the Profile template on page load, HTML encoded inside a code block.
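As a sketch of reading JSON embedded this way: when working from the DOM, textContent already returns decoded text, so JSON.parse can be applied directly; when working from raw HTML, the entities need decoding first. The helper names are hypothetical, and only a handful of common entities are handled here.

```javascript
// Minimal entity table; real pages may use entities beyond these.
const ENTITIES = {
  '&quot;': '"',
  '&#34;': '"',
  '&apos;': "'",
  '&#39;': "'",
  '&lt;': '<',
  '&gt;': '>',
  '&amp;': '&',
};

// Decode in a single pass so "&amp;quot;" becomes "&quot;" and not '"'.
function decodeHtmlEntities(text) {
  return text.replace(/&(?:quot|apos|lt|gt|amp|#34|#39);/g, (match) => ENTITIES[match]);
}

// Parse JSON taken from raw, still-encoded HTML.
function parseEmbeddedJson(encodedText) {
  return JSON.parse(decodeHtmlEntities(encodedText));
}

// Browser variant: collect every code element whose text parses as JSON.
function readEmbeddedJson(doc) {
  const results = [];
  for (const node of doc.querySelectorAll('code')) {
    try {
      results.push(JSON.parse(node.textContent));
    } catch (error) {
      // Not JSON; ignore this element.
    }
  }
  return results;
}
```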

Overall, though, the output structure is fairly similar for both Profile and Connection, which should make for a reusable model that aligns the old structure with the new.

Currently, this is the step being worked on.

Code available at: ConnectionGeographyJS