13 July 2010

Managing SharePoint Sandbox (User) Solutions

In SharePoint 2010, Microsoft has provided new functionality called “Sandbox” or “User Solutions” that allows administrators and site collection owners to upload new functionality scoped to a single site collection. This means that site collection owners can host bespoke code or third-party add-ons that are useful to them without affecting others on the same farm. This functionality is incredibly useful in situations where organizations have shared SharePoint farms that serve the needs of various groups. It’s also very useful for organizations that are looking to leverage services like SharePoint Online or Microsoft Business Productivity Online (both services offer a SaaS-based version of SharePoint), but have historically been prevented from including custom code.

In the May issue of SearchWinIT’s SharePoint E-Zine, I discuss the management controls and shortcomings of this new feature.

03 July 2010

Can Search really solve Information “Findability?”

We’ve all heard the lament from end users and IT professionals alike: “I can’t find what I need in our … [fill in the blank here – network share, intranet, portal, etc].”  The reasons, almost universally, are that: 1) the content has no intelligible organization and 2) the internal search tool never seems to return the results people need or want.  As a result, there’s a growing sentiment that “the solution” is to just get a Google appliance (or other inappropriately expensive search tool) and everything will be fine.  It’s worth noting that this sentiment comes from all levels in the organization – from executive business management to senior IT management and all the way down to the average Joe/Jane employee (IT or not) – which makes the statement carry that much more weight.

Interestingly, there’s a certain logic to this statement (perverse though it may be to people who spend their professional careers mired in trying to solve the problem of findability).  When most people search for content on the web, they overwhelmingly use Google’s search technology.  In my own experience, if the content exists, it’s likely I can find it with Google search.  How Google makes it happen is a mix of complex mathematics, massive computing power and (seemingly to me) black magic, since even companies with vast resources invested in bettering search (like Microsoft) have yet to equal Google’s search success.  However, isn’t Google’s success also partly due to perception and, frankly, low expectations?

By and large, research suggests that most people “hope” to find content when searching the web.  If I execute a query using Google’s search site, I don’t actually know if the content I want exists.  I hope that it does and, when I find it, I’m pleasantly surprised.  By contrast, when I search for content internally, it’s not a question of “if” the content exists, it’s a question of where.  I know that the content lives somewhere inside my … [again fill in the blank – file share, intranet, portal, etc], I just don’t know where.

When I use my portal’s search engine to execute a query, I expect to find the content I want.  However, when the content I know exists doesn’t appear in the search results (or doesn’t seem to appear), I get very frustrated (can I get a Marine hoo-rah if you empathize??).  If I’m feeling particularly persistent, I may execute another query, perhaps using different keywords, to see if I can find my content in a different way.  This second attempt may or may not yield better results.  In the end though, my perception is that I’ve just wasted a bunch of time and I’m no better off.  Worse yet, I have now come to two very harsh, though potentially inaccurate, conclusions: 1) Google rocks and I don’t understand why we don’t use it internally and 2) whatever search technology we are using internally stinks (I’ve heard clients use more colorful language, which I’ll spare you here).  No less important, I’m deeply disappointed that I will now waste more energy either recreating the content I can’t find or, at least, doing duplicative research to support the work I actually have to complete.

If you’re still reading, I suspect you not only empathize, but are hoping I’ll give you “the solution.”  My honest response is that if I had a silver bullet solution, I would probably be very wealthy.  I would be cruising the Atlantic on a large yacht, with a drink in my hand and not searching for anything in particular except perhaps the elusive lime to add a little something extra to my libation.  Since my reality is that there is no silver bullet solution (and I don’t have a boat), I offer some truisms and suggestions about finding content in your organization and how to make that process better:

  • Taxonomy is required (really) 
    If you want people to find content, in any repository, you must (I don’t use this word lightly) apply some sort of organizing principle to that content (“organizing principle” is synonymous with taxonomy).  You don’t have to look much further than your physical file rooms (organized alphabetically), biology (order, genus, species, etc – you remember high school biology, no?) or virtually any other repository of stuff to figure out that if it’s not organized you’ll have a slim chance of finding what you need.  To provide a very concrete example, look at Best Buy’s navigation for a technology category approach that helps you find the product you want; if you can’t find what you want, let it stand as an example of how poor taxonomy can confound someone.  That said, I recently had a CIO tell me that he didn’t “get” the value of taxonomy and that we all live in a “search-based world.”  Though I respect this particular CIO and I get the sentiment, it’s just not reality.  You must develop a way to organize your content and make that organization (taxonomy) perfectly clear to the people for whom it will be the path to finding content nirvana.  Frankly, not everyone will use search to find content, and a poor taxonomy will impact your search tool’s ability to return relevant results.
  • Google is good, but it’s not likely to work internally
    It’s true that search technology has improved tremendously.  Google is clearly the leader.  However, its success is not actually magic and probably won’t translate to internal search success.  Public search and “enterprise” search are different beasts: Google is very good at public search, but hasn’t fared any better than any other search company within the enterprise.  Also keep in mind that Google’s dominance in general public web search stems from a number of factors, most of which do not exist inside any single company (a few of those key dimensions are listed here):
    • An overwhelming volume of content that can imply topic relevance through cross-linking (keywords in one document and/or site that are linked to other content and/or sites that imply authority for that keyword)
    • Massive computing power that can be used to execute complex algorithms (their specific “secret sauce”)
    • Search analytics that help create patterns of relevancy based on actual user click behavior (result select or re-query)
    • Legions of Google staffers who constantly improve Google’s algorithms over time (based on content data and analytics).  Despite what you may have heard about what Google does for a living, it’s all about search (not phones or collaboration software).  No other part of their business comes close to generating the amount of revenue (obviously through ads shown with search results) or has the same amount of resources devoted to it.
    • Just as Microsoft had “help” from IBM in becoming a dominant player in the PC operating system business, Google had a similar giant help it early on – Yahoo.  When Yahoo was the most popular site on the web, guess what search technology they used?  That deal made it much easier for Google to supplant its patron, in later years, as the leader in search.
  • Search is not the exclusive answer to findability
    Even if someone could create the perfect search engine that returned very precise and relevant results, why would you want to force your user community to execute a search just to find content?  Doesn’t the mere act of search introduce yet another set of clicks and keystrokes?  Isn’t the goal to “get there in less than two clicks” and “provide answers quickly?”  Think about your own experiences with search.  How long does it take to formulate the appropriate query?  To review the results?  To find the right result, click on it and validate it’s actually what you want?  Wouldn’t it be easier to create a very clear path to the right content?  Better yet, don’t you already know a good deal about your employees and couldn’t you predict at least some of their content needs?  If so, why not just surface that content on the home page of your portal (or other appropriate location)?  Or wrap those items in an RSS feed and deliver them through a desktop gadget (works on both Macs and PCs).  Do you think I’m crazy?  This technology surely couldn’t exist, you say (or it’s too complicated to implement).  Have you ever visited your iGoogle page?  Have you ever ordered anything from Amazon?  These sites aren’t doing anything that revolutionary in the year 2010.  They’re simply using data they already know about you to automatically find what might be relevant (or asking you to choose topics from the taxonomy that you are interested in viewing).  Is it perfect?  No.  Does it save time in the long run?  Absolutely.  Does it provide you a somewhat “searchless” content findability experience?  You bet.  Could you still supplement this approach with a search tool?  Of course.
  • Successful search experiences require constant work
    Even if you have spun up a Google appliance, stood up SharePoint search (or FAST) or implemented Autonomy, you’ve only just begun.  Like anything else in this world, success comes from discipline, dedication and ongoing effort.  Search is no different.  More than one of our clients has called and asked for our help in improving their search experience.  Unfortunately, virtually all of them have done next to nothing, beyond installing the software, to make search work properly.  Whether it’s not excluding common navigation items on a web site (the same navigation shown on every page messes with relevancy for those keywords), avoiding the application of metadata on documents (like no title or descriptions – not even counting the “fancy” stuff like author, department, main topic area) or leaving patently irrelevant content in the repository past its expiration (like the snow day policy from 2001 that for some crazy reason is the most relevant document when using “snow policy” as a query), clients frequently thwart their own best intentions.  To make search work, you must do a bare minimum: exclude content that should never be a result (like a home page), apply metadata with appropriate values, perform periodic “gut check” searches that should yield consistently relevant results and monitor the behavior of your users by actually looking at analytic data (you’d be surprised how many queries return zero results or where no one clicks on a result).  If you just do this minimum, your search experience will dramatically improve.
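
To make the analytics point above a little more concrete, here is a minimal sketch (in Python) of the kind of check I mean: counting queries that return zero results and queries where no one clicks on a result.  The log format and sample entries are entirely hypothetical; every search product exposes this data in its own way, so treat this as an illustration rather than anything specific to SharePoint, FAST, Autonomy or a Google appliance.

```python
from collections import Counter

# Hypothetical query-log records, assumed for illustration only:
# (query text, number of results returned, whether the user clicked a result)
SAMPLE_LOG = [
    ("snow policy", 12, True),
    ("expense form", 0, False),
    ("Expense Form", 0, False),
    ("org chart", 8, False),
    ("org chart", 8, True),
    ("benefits enrollment", 5, False),
]


def summarize(log):
    """Count zero-result queries and queries that had results but no click."""
    zero_results = Counter()
    no_clicks = Counter()
    for query, result_count, clicked in log:
        key = query.strip().lower()  # treat "Expense Form" and "expense form" as one query
        if result_count == 0:
            zero_results[key] += 1
        elif not clicked:
            no_clicks[key] += 1
    return zero_results, no_clicks


zero_results, no_clicks = summarize(SAMPLE_LOG)
print("Zero-result queries:", zero_results.most_common())
print("Results shown but no click:", no_clicks.most_common())
```

Queries that keep showing up in the first list usually point to missing content or missing metadata; queries in the second list are good candidates for your periodic “gut check” searches.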

If I’ve prattled on too long in this post and you’ve missed my point about findability, here’s the short version: search is not the exclusive answer to finding content; you must start with a taxonomy that everyone understands (and fits your content), implement search properly and monitor progress (making changes when necessary).  Anyone who says differently is either selling a search tool or has been sold a bridge in Brooklyn.