Kenny Bastani: graphs

About a year ago I read about Ray Kurzweil's "Pattern Recognition Theory of Mind", which he articulates in his book, "How to Create a Mind". I picked up the book after struggling with the idea of implementing a deep learning algorithm for parsing natural language text on Wikipedia. My goal was to discover links in volumes of text that were not already linked. I ended up developing all kinds of cool heuristics to do this, mostly by a lot of trial and error. King of these heuristics was a pretty simple algorithm at the core of the library that would find redundancies in batches of text content.

How this worked is if a phrase was mentioned repeatedly in a collection of about 50 sentences, then I could extract that phrase as a node and link it back to the pieces of content it belonged to. Every now and then you'll get a reference to another article's name, which can then be verified against Wikipedia's site index, which would provide more sentences to find repeated phrases within.

I struggled with persistency because I knew how ugly my problem was for a relational database. I created some entity-relationship models, and implemented them using Entity Framework over Microsoft SQL Server. It worked, kind of. I waited patiently to happen upon a better solution. Thankfully I did, and using a graph database I was able to take my cool little algorithms and solve my persistency problem at scale.

Kenny Bastani

Pages

Understanding How Neo4j Cypher Queries are Evaluated

Wednesday, July 9, 2014

Hierarchical Pattern Recognition

Tuesday, June 17, 2014

Blog Archive

Labels

Featured

Building Spring Cloud Microservices That Strangle Legacy Systems

Popular Posts