Conference Training

Courses

Solr Unleashed

A Hands-On Workshop for Building Killer Search Apps

Is your Solr installation bullet-proof? Will it really scale the way you think it will? Is your relevancy not cutting it? Let the Solr experts show you the right way to implement all the platform capabilities you need to be using, but probably aren’t. You will walk away confident that your Solr installation is implemented in the best possible way—rock solid and scalable. This course is taught using Solr 4.x.

Who Should Attend?

This course is intended for Developers. System Administrators are welcome to attend, but it is primarily designed for people who have experience developing web applications in Java, PHP, Ruby or similar languages.

Course Overview

Having consulted with clients on Lucene and Solr for the better part of a decade, we’ve seen the same mistakes made over and over again: applications built on shaky foundations, stretched to the breaking point. In this two-day class, learn from the experts how to do it right and make sure your apps are rock solid, scalable, and produce relevant results.

Course Outline

The Fundamentals
  • About Solr
  • Installing and running Solr
  • Adding content to Solr
  • Reading a Solr XML response
  • Changing parameters in the URL
  • Using the browse interface
Searching
  • Sorting results
  • Query parsers
  • More queries
  • Hardwiring request parameters
  • Adding fields to default search
  • Faceting
  • Result grouping
Indexing
  • Adding your own content to Solr
  • Deleting data from Solr
  • Building a bookstore search
  • Adding book data
  • Exploring the book data
  • Dedupe updateprocessor
Updating your schema
  • Adding fields to the schema
  • Analyzing text
Relevance
  • Field weighting
  • Phrase queries
  • Function queries
  • Fuzzier search
  • Sounds-like
Extended features
  • More-like-this
  • Geospatial
  • Spell checking
  • Suggestions
  • Highlighting
  • Pseudo-fields
  • Pseudo-joins
  • Multilanguage
Multicore
  • Adding more kinds of data
SolrCloud
  • Introduction
  • How SolrCloud works
  • Commit strategies
  • ZooKeeper
  • Managing Solr config files

Learning Objectives

This class is all about best practices. The end goal is for students to walk away confident that their Solr installation is implemented in the best possible way.

Prerequisites

This is a technical class for technical people. Experience with Solr is not required, but you should at minimum be comfortable using a command line (console, shell) to execute basic commands.

Introduction to Applied Natural Language Processing (NLP)

The automated processing of text data is now being applied successfully to mission-critical tasks in industries as varied as medicine, finance, law, advertising, and engineering. The tutorial will cover best practices for many of these tasks from the perspective of proven applications, methods, practices, tools, and resources.

Course Overview

After attending this tutorial, participants will be able to build their own NLP systems for each of these topics and achieve good baseline results in a short time.

Bio

Gabor Melli is the Chief Scientist at VigLink.com, where he leads initiatives to automate mission-critical, semantically rich processes. This work largely involves training predictive models for classification, sequence labeling, and estimation, for tasks such as named entity recognition and disambiguation in user-generated text, using techniques and tools such as CRFs, SVMs, HMMs, logistic regression, LDA, NLTK, Python, R, Scala, Java, Hive, Hadoop, Cassandra, RedShift, and AWS EC2/S3/EMR. He has led and delivered large-scale data-driven initiatives at organizations ranging from Microsoft, AT&T, T-Mobile, ICBC, Washington Mutual, and Wal*Mart to start-ups such as Datasage, Meals.com, PredictionWorks, and now VigLink.

Gabor holds a PhD in Computing Science from Simon Fraser University on the topic of document-to-ontology interlinking. He has been active in the data science community for over fifteen years and received the ACM SIGKDD Service Award in 2013. His current research interests include iterative semantic semi-supervised text analysis and automated business process optimization.
