Acumen Indexing

From UA Libraries Digital Services Planning and Documentation

Revision as of 13:30, 30 June 2017

Acumen indexing is automated by a cron job scheduled under the "acumen" user, which runs the script acumen.pl in /home/acumen/bin.

This script does the following things:

  1. Backs up the primary acumen database
  2. Swaps the index.php scripts in the primary web directory (/srv/www/htdocs/acumen-old/) so that the staging subdirectory is what is live online instead, and the staging database is the one referenced for delivery
  3. Performs a Solr index on the acumen database (java -Xms1g -Xmx2g -jar /home/acumen/acumen.jar acumen >> /home/acumen/acumen.log). NOTE: I recently raised the amount of RAM allotted in this call, so you shouldn’t have to modify it again unless the quantity of content grows a great deal before you switch systems.
  4. Dumps copies of the primary tables out of the newly indexed database
  5. Swaps the index.php scripts back so that the acumen-old directory is again what is live online
  6. Pulls copies of the primary tables into the staging database
  7. Imports that data into the staging version of solr
  8. Removes older versions of the backed-up databases and tables (you'll see the backups in /home/acumen/sql/; should you need older ones, look in /srv/backups/)
  9. Logs what it’s doing along the way (you can see this in /home/acumen/acumen.log and acumen_cron.log)
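
To check what the script did on its last run, the logs named above can be inspected from the shell. The following is only a minimal sketch: the helper name `recent_log` is hypothetical, and because the text does not say which directory acumen_cron.log lives in, the log path is passed as an argument rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper (not part of acumen.pl): print the last lines of a log.
# Usage: recent_log FILE [LINES]   (LINES defaults to 20)
recent_log() {
    log_file=$1
    lines=${2:-20}
    if [ -r "$log_file" ]; then
        tail -n "$lines" "$log_file"
    else
        # Report unreadable or missing logs on stderr and fail.
        echo "cannot read $log_file" >&2
        return 1
    fi
}
```

For example, `recent_log /home/acumen/acumen.log 50` would show the last 50 lines of the indexing log. The same call works for acumen_cron.log once you know where it is kept.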

It’s been working fine without modifications for almost a year now, and hasn’t needed any attention.

What’s more likely to go wrong is that content put into the web directories gets mucked up and has to be corrected. Sometimes that means correcting what’s in the database as well as the web directories, and then reindexing. You can always reindex as follows:

  1. Change directories to /home/acumen/bin
  2. Become the acumen user. Type in: `su acumen`
  3. Run the script acumen.pl

This uses a lot of CPU, so if possible, do this when few people are using the system.
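
The manual reindex steps above could be wrapped in a small guard so the script is not started under the wrong account by accident. This is a sketch under assumptions, not something that exists on the server: the function `run_reindex` and the variables `ACUMEN_USER` and `ACUMEN_BIN` are hypothetical, and it assumes acumen.pl is executable and can be started as `./acumen.pl`.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual reindex steps.
# Defaults follow the text; both can be overridden from the environment.
ACUMEN_USER=${ACUMEN_USER:-acumen}
ACUMEN_BIN=${ACUMEN_BIN:-/home/acumen/bin}

run_reindex() {
    # Refuse to run unless we are already the expected user.
    if [ "$(id -un)" != "$ACUMEN_USER" ]; then
        echo "run this as $ACUMEN_USER (for example: su $ACUMEN_USER)" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    # Assumes acumen.pl is executable; otherwise invoke it through perl.
    ( cd "$ACUMEN_BIN" && ./acumen.pl )
}
```

After `su acumen`, calling `run_reindex` starts the same script the cron job runs. Expect the same CPU load as a scheduled run.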
