Search Catalog


  1. Constructing a query
  2. Understanding the Search Progress screen
  3. Viewing the Output of Search Results
  4. Controlling who can search through resources
  5. Search engine performance measurements
  6. Notes on software architecture
  7. Limitations

1. Constructing a query


Queries are constructed with three logical operators: AND, OR, and NOT. Combined with search terms, these operators form a logical expression (a Boolean query). A search term is either an alphanumeric word or a phrase delimited by quotation marks. Logical operations can be grouped together with parentheses.

Examples include:

The search is case-insensitive. Logical operators are also evaluated case-insensitively, so "and" is equivalent to "AND".
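
The query rules described above can be illustrated with a minimal Python sketch. This is not LON-CAPA's own implementation. The function names tokenize and matches are hypothetical, and the operator precedence (NOT binds tightest, then AND, then OR) is an assumption, since the text does not state one. In this sketch, two terms written side by side without an operator are rejected as an error.

```python
import re

# A token is a quoted phrase, a parenthesis, or an alphanumeric word.
TOKEN_RE = re.compile(r'"([^"]*)"|([()])|([A-Za-z0-9]+)')
OPERATORS = {"and", "or", "not"}


def tokenize(query):
    """Split a query into (kind, value) pairs; kind is 'op', 'paren', or 'term'."""
    tokens = []
    for phrase, paren, word in TOKEN_RE.findall(query):
        if paren:
            tokens.append(("paren", paren))
        elif word and word.lower() in OPERATORS:
            tokens.append(("op", word.lower()))  # operators are case-insensitive
        else:
            tokens.append(("term", (phrase or word).lower()))
    return tokens


def matches(query, text):
    """Evaluate a Boolean query against a piece of text (case-insensitive)."""
    tokens = tokenize(query)
    words = re.findall(r"[a-z0-9]+", text.lower())
    normalized = " " + " ".join(words) + " "
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():
        value = parse_and()
        while peek() == ("op", "or"):
            advance()
            right = parse_and()  # always parse the right side to consume its tokens
            value = value or right
        return value

    def parse_and():
        value = parse_not()
        while peek() == ("op", "and"):
            advance()
            right = parse_not()
            value = value and right
        return value

    def parse_not():
        if peek() == ("op", "not"):
            advance()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        kind, value = peek()
        if kind == "paren" and value == "(":
            advance()
            result = parse_or()
            if peek() != ("paren", ")"):
                raise ValueError("missing closing parenthesis")
            advance()
            return result
        if kind == "term":
            advance()
            phrase = " ".join(re.findall(r"[a-z0-9]+", value))
            return f" {phrase} " in normalized  # whole-word (or whole-phrase) match
        raise ValueError("unexpected token in query")

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result
```

For instance, matches('force AND (mass OR "free body")', text) is true for any text that contains the word "force" together with either the word "mass" or the phrase "free body".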


 

2. Understanding the Search Progress screen


The Search Progress screen provides five pieces of information. This information is dynamically updated every second as the search progresses across the LON-CAPA network.

  1. The number of library servers being scanned.
  2. The number of database hits found.
  3. The time elapsed (in seconds).
  4. A grid showing the response status of every LON-CAPA library server.
  5. A window for displaying response details of individual LON-CAPA library servers.

The response status grid consists of the following symbols:


 

3. Viewing the Output of Search Results


The interface provides four different ways to format the output of metadata information.


 

4. Controlling who can search through resources


Currently, any user can see metadata for any published resource. We are working to change this and are considering two possibilities:

  1. Browsing and searching are permitted only if the user
     
    * either stays within their own directory (georgio can only browse
       and search /res/DOMAIN/georgio)
    * or has advanced status, as indicated by $env{'user.adv'}
    
  2. If a user can access a resource through the current role (student in a
    class, etc.), then the resource shows up when searching and browsing,
    even if resource conditionals prevent actually viewing
    the specific resource.  Advanced users can search and browse
    "everywhere".
    

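The two possibilities can be sketched as simple predicates. This is only an illustration of one reading of the list above: the function names, parameters, and the idea of passing in a set of role-accessible paths are all hypothetical, and nothing here reflects code that exists in LON-CAPA.

```python
def allowed_under_option_1(username, domain, resource_path, is_advanced):
    """Option 1 (as read here): own directory only, unless the user is advanced."""
    if is_advanced:  # stands in for a true $env{'user.adv'}
        return True
    own_prefix = f"/res/{domain}/{username}/"
    return resource_path.startswith(own_prefix)


def allowed_under_option_2(resource_path, role_resources, is_advanced):
    """Option 2 (as read here): anything reachable through the current role.

    role_resources is a hypothetical set of resource paths that the current
    role gives access to; resource conditionals are deliberately ignored.
    """
    if is_advanced:  # advanced users can search and browse "everywhere"
        return True
    return resource_path in role_resources
```

Under the first predicate, georgio without advanced status would see /res/DOMAIN/georgio/quiz.problem but not a resource under another author's directory.
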
 

5. Search engine performance measurements



 

6. Notes on software architecture


LON-CAPA is meant to distribute A LOT of educational content to A LOT of people. It is inefficient to rely on scanning files in the ext2 filesystem directly for on-the-fly searches of content descriptions. (Simply put, it takes a cumbersome amount of time to open, read, analyze, and close thousands of files.)

The solution is to hash-index various data fields that describe the educational resources on a LON-CAPA server machine. These descriptive data fields are referred to as "metadata". The question then arises of how this metadata is handled across the rest of the LON-CAPA network without burdening client and daemon processes. I answer this question below in Problem and Solution format.
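
To make the indexing idea concrete, here is a minimal in-memory sketch in Python. The text does not spell out the actual indexing scheme, so this only illustrates the general principle: build a word-to-resource lookup once, then answer searches from it instead of reopening every file. The names build_index and lookup, the record layout, and the example paths are assumptions.

```python
import re
from collections import defaultdict


def build_index(metadata_records):
    """Map each lowercase word to the set of resources whose metadata contains it.

    metadata_records: dict of resource URL -> dict of field name -> field text.
    """
    index = defaultdict(set)
    for url, fields in metadata_records.items():
        for value in fields.values():
            for word in re.findall(r"[a-z0-9]+", value.lower()):
                index[word].add(url)
    return index


def lookup(index, word):
    """Return the resources indexed under a word (case-insensitive)."""
    return index.get(word.lower(), set())


# Hypothetical example records, not real LON-CAPA resources.
records = {
    "/res/demo/a.problem": {"title": "Free Fall", "author": "Smith"},
    "/res/demo/b.problem": {"title": "Free Body Diagrams", "author": "Jones"},
}
index = build_index(records)
```

A lookup is then a single dictionary access, no matter how many resources were indexed.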

PROBLEM SITUATION:

  If Server A wants data from Server B, Server A uses a lonc process to
  send a database command to a Server B lond process.
    lonc= loncapa client process    A-lonc= a lonc process on Server A
    lond= loncapa daemon process

                 database command
    A-lonc  --------TCP/IP----------------> B-lond

  The problem is that A-lonc and B-lond are kept waiting for the
  MySQL server to "do its stuff", or in other words, to perform a potentially
  sophisticated, data-intensive, time-consuming database transaction.  Tying
  up a lonc and a lond process in this way significantly cripples the
  capabilities of LON-CAPA servers.

  While commercial databases have a variety of features that ATTEMPT to
  deal with this, freeware databases are still experimenting with
  different schemes, with varying degrees of performance stability.

THE SOLUTION:

  A separate daemon process was created that B-lond works with to
  handle database requests.  This daemon process is called "lonsql".

  So,
                database command
  A-lonc  ---------TCP/IP-----------------> B-lond =====> B-lonsql
         <---------------------------------/                |
           "ok, I'll get back to you..."                    |
                                                            |
                                                            /
  A-lond  <-------------------------------  B-lonc   <======
           "Guess what? I have the result!"

  Of course, depending on success or failure, the messages may vary,
  but the principle remains the same: a separate pool of child
  processes (lonsql's) handles the MySQL database manipulations.
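
The hand-off pattern in the diagram above can be sketched in Python as follows. This is an illustration only: threads and an in-process queue stand in for the separate lonsql child processes and the TCP/IP connections, and the names start_workers, accept_request, handle_query, and deliver_result are hypothetical.

```python
import queue
import threading


def start_workers(handle_query, deliver_result, num_workers=2):
    """Start a pool of workers (stand-ins for lonsql children)."""
    requests = queue.Queue()

    def worker():
        while True:
            item = requests.get()
            if item is None:  # sentinel: shut this worker down
                requests.task_done()
                break
            try:
                request_id, query = item
                # The slow database work happens here, away from the daemon.
                deliver_result(request_id, handle_query(query))
            finally:
                requests.task_done()

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    return requests, threads


def accept_request(requests, request_id, query):
    """Stand-in for B-lond: queue the work and acknowledge immediately."""
    requests.put((request_id, query))
    return "ok, I'll get back to you..."
```

The design point is that accept_request returns at once, so the process that received the request is free again while a worker finishes the query and reports the result through deliver_result.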


 

7. Limitations


A metadata search request can consist only of spaces and alphanumeric characters. Other characters are illegal and are filtered out when the search request is sent to the search engine.
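
A filter of this kind might look like the following Python sketch. The function name sanitize_query is hypothetical, and it is an assumption that illegal characters are removed outright rather than replaced with spaces.

```python
import re


def sanitize_query(raw):
    """Drop every character that is not an ASCII letter, digit, or space."""
    return re.sub(r"[^A-Za-z0-9 ]", "", raw)
```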

LON-CAPA library servers are given 9 seconds to inform another server that they are in the process of generating a reply to a search request. Note that this is DIFFERENT from actually conducting the search. On initial communication, each library server just sends a response key that indicates the name of the results file that is going to be generated.

LON-CAPA library servers will only send up to 100 records in response to a search.

The output of matching records is limited to 200 records.

The caps of 100 and 200 records should eventually be user-modifiable. These limitations exist to avoid processing overly expansive search requests.
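
How the two caps interact can be shown with a short Python sketch. The constant and function names are hypothetical, and the order in which server replies are merged is an assumption; only the values 100 and 200 come from the text above.

```python
PER_SERVER_LIMIT = 100  # records a single library server will send
OUTPUT_LIMIT = 200      # records shown in the combined output


def combine_results(results_by_server):
    """Merge per-server record lists, applying both caps."""
    combined = []
    for records in results_by_server.values():
        combined.extend(records[:PER_SERVER_LIMIT])
    return combined[:OUTPUT_LIMIT]
```

With three servers holding 150, 150, and 50 matching records, the first two each contribute 100 records, and the output cap then cuts the combined 250 records down to 200.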