Archive for the ‘Useful Information’ Category

Backup Specify 6 using Windows Scheduler (Part 1)

Friday, January 27th, 2012

Sometimes it is nice to know your scientific collection is safe. Specify does ask you to back up your system, but wouldn’t it be great if the machine did it automatically, say every Friday night? Here are a few tips for setting up a batch file that does just that. This guide is for Windows only.

Step 1 is to make a folder (example: c:\_specify_backup\) on the computer that runs the MySQL server for Specify. The folder has to be on that same machine, and we are assuming it is running Windows.

Step 2 you will need to make a “specify_backup.bat” file in that folder and edit it using Notepad. I use { } to mark the places where you fill in your own information. C:\_specify_backup can be changed to whatever folder you made in step 1. Now paste the code below and modify it to match your connection information. Note that this is not your Specify login but your MySQL database connection information. The %date% pieces build a year-month-day stamp and assume a US-style system date such as “Fri 01/27/2012”; other regional settings will need different offsets.

mysqldump {specify_database_you_want_to_backup} -u {db_user_with_permission} -p{user_pass} > C:\_specify_backup\specify6_{collectioncode}_%date:~-4%%date:~-10,2%%date:~-7,2%.sql
c:\_specify_backup\gzip.exe C:\_specify_backup\specify6_{collectioncode}_%date:~-4%%date:~-10,2%%date:~-7,2%.sql
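
The %date:~start,length% pieces in the file name are substring expansions of the %date% variable, and a negative start counts from the end of the string. The following Python sketch only imitates that slicing so you can see which characters each piece picks out. The function name batch_substr and the sample date string are assumptions for illustration; your own %date% output depends on the regional settings of the machine.

```python
def batch_substr(value, start, length=None):
    """Imitate cmd's %var:~start,length% substring expansion."""
    n = len(value)
    # A negative start counts back from the end of the string.
    begin = max(n + start, 0) if start < 0 else min(start, n)
    if length is None:
        end = n                          # no length: take the rest
    elif length < 0:
        end = max(n + length, begin)     # negative length: stop before the end
    else:
        end = min(begin + length, n)
    return value[begin:end]


# Assumed US-style value of %date%; check yours by typing "echo %date%".
sample = "Fri 01/27/2012"

year = batch_substr(sample, -4)       # last four characters
month = batch_substr(sample, -10, 2)  # two characters, ten from the end
day = batch_substr(sample, -7, 2)     # two characters, seven from the end

print(year + month + day)
```

With the sample value above, using the offset -7,2 twice would pick out the day twice, so the month needs its own offset (-10,2 for this date format).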

Step 3 is to go to the GZip website (http://www.gzip.org/#exe) and download http://www.gzip.org/gzip124xN.zip and move the gzip.exe into the same specify backup folder. So now you should have the specify_backup.bat and gzip.exe files in your folder.

Step 4, let’s test that it is working by running the .bat from the command prompt. To do this, go to Start > Run and type “cmd”, and a black command window should open. You will need to type:

cd C:\_specify_backup\
specify_backup.bat

If all the connection information was correct you should see a specify6_{collectioncode}_{yearmonthday}.sql.gz file. This is a gzip-compressed copy of your database dump.
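
If you would rather not depend on the regional date format or on gzip.exe, the same dump-and-compress step can be sketched in Python. This is an alternative to the batch file, not what the batch file does internally. It assumes Python 3 is installed and mysqldump is on the PATH; the function names and the arguments you pass in are placeholders.

```python
import datetime
import gzip
import shutil
import subprocess
from pathlib import Path


def backup_name(collection_code, day):
    """Build specify6_{collectioncode}_{yyyymmdd}.sql for the given date."""
    return f"specify6_{collection_code}_{day:%Y%m%d}.sql"


def gzip_file(sql_path):
    """Compress sql_path to sql_path + '.gz' and delete the uncompressed file."""
    sql_path = Path(sql_path)
    gz_path = sql_path.with_name(sql_path.name + ".gz")
    with open(sql_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    sql_path.unlink()
    return gz_path


def run_backup(folder, database, user, password, collection_code):
    """Dump the database with mysqldump, then compress the dump."""
    sql_path = Path(folder) / backup_name(collection_code, datetime.date.today())
    with open(sql_path, "wb") as out:
        # check=True raises an error if mysqldump exits with a failure code.
        subprocess.run(
            ["mysqldump", "-u", user, f"-p{password}", database],
            stdout=out,
            check=True,
        )
    return gzip_file(sql_path)


# Example call (placeholders, not real credentials):
# run_backup(r"C:\_specify_backup", "specify_db", "backup_user", "secret", "herb")
```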

Step 5 go ahead and delete this file using “My Computer” or however you normally delete a file. At the command prompt it is “del {filename}”.

Step 6 go to Start and search for “Scheduler” and you should see Task Scheduler. Click that and go to the next step.

Step 7 you need to click the “Task Scheduler Library” then click “Create Task…”

Now fill in the information:
General > Name: Specify 6 Backup
General > Description: Creates weekly backup for Specify 6 databases.
Check “Run whether user is logged in or not”
Check “Run with highest privileges”
Triggers > New… >
Check: Weekly, Friday and change the time to 22:00 (to run at 10pm every Friday)

Actions > New… > Program/script: C:\_specify_backup\specify_backup.bat
Click OK and you should see a new task in your “Task Scheduler Library”.
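
If you prefer to script the task instead of clicking through the dialog, Windows also ships a schtasks command. The Python sketch below only builds and prints a roughly equivalent command line; it does not create anything. The helper name weekly_task_command is an assumption, and the sketch does not cover the “Run whether user is logged in or not” option, which needs account details supplied through additional schtasks options.

```python
import subprocess


def weekly_task_command(task_name, script_path, day="FRI", start_time="22:00"):
    """Return a schtasks argument list for a weekly task."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,      # task name shown in Task Scheduler
        "/TR", script_path,    # program or script to run
        "/SC", "WEEKLY",       # schedule type
        "/D", day,             # day of the week
        "/ST", start_time,     # start time in 24-hour HH:mm
        "/RL", "HIGHEST",      # run with highest privileges
    ]


command = weekly_task_command(
    "Specify 6 Backup", r"C:\_specify_backup\specify_backup.bat"
)

# Show the command as one line you could paste into an elevated command prompt.
print(subprocess.list2cmdline(command))
```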

If you did everything correctly you should see a backup of your database right after 10pm every Friday. Since each file keeps the date in the name you will be able to keep all the weekly backups. If you see that your file is 0 or only a few bytes in filesize then something probably did not work correctly with the connection.
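
A quick way to check a backup without opening it is to see whether anything comes out when it is decompressed. A gzip of an empty dump is tiny but not necessarily zero bytes, so the Python sketch below looks at the decompressed content rather than the file size. The function name backup_has_data is an assumption for illustration.

```python
import gzip


def backup_has_data(gz_path):
    """Return True if gz_path is a readable gzip file with a non-empty payload."""
    try:
        with gzip.open(gz_path, "rb") as fh:
            # Reading a single byte is enough to tell empty from non-empty.
            return len(fh.read(1)) == 1
    except (OSError, EOFError):
        # Missing file, unreadable file, or not a gzip file at all.
        return False


# Example call (the path is a placeholder):
# backup_has_data(r"C:\_specify_backup\specify6_herb_20120127.sql.gz")
```

This only tells you that the dump is not empty; it does not prove the dump is complete, so restoring to a test database now and then is still worthwhile.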

In “Part 2” of this blog we will address how to add another Scheduled Task to send these backups to an online FTP location for offsite storage. QuickTip: If you use Dropbox (https://www.dropbox.com/) you can change the mysqldump path to point to a Dropbox folder and it will automatically get synchronized and backed up using the Dropbox service. Note: If you make that folder public then anyone will be able to download your Specify backups, which may or may not be a good thing. Also note that you are storing a user/password to connect to MySQL on your machine, which carries some risk. It is a good idea to consult with your IT department about creating a special user whose access is restricted to your collection database only.

If you have any questions or comments, feel free to send me an email at mikegiddens (at) silverbiology.com

Amazon EC2 + Tesseract + OCR = Thank You!

Thursday, March 10th, 2011

Today I took on the challenge of setting up an Amazon Micro (Free Tier) machine to run a simple web service for OCR using Tesseract (http://code.google.com/p/tesseract-ocr/).

There is a default web service set up with these params:

img: uri to the jpg image that you want to transform
callback: used for crossdomain services
format: defaults to json but you can also supply txt to get just raw text
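
As a rough illustration of how those three params fit together, here is a Python sketch that builds a request URL. The post does not give the service address, so the http://example.org/ocr endpoint and the image URL are placeholders, and build_ocr_url is a made-up helper name.

```python
from urllib.parse import urlencode


def build_ocr_url(service_url, img, fmt="json", callback=None):
    """Build a request URL from the img, format and callback params."""
    params = {"img": img, "format": fmt}
    if callback is not None:
        # Only needed for crossdomain calls.
        params["callback"] = callback
    return f"{service_url}?{urlencode(params)}"


# Placeholder endpoint and image; ask for raw text instead of json.
url = build_ocr_url(
    "http://example.org/ocr", "http://example.org/label.jpg", fmt="txt"
)
print(url)
```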

To build a machine you can follow these steps or if you are interested in the image let us know and I will contact you with more information.

(more…)

What is a Darwin Core Archive and who uses it?

Thursday, November 18th, 2010

In our upcoming release of SilverCollection v1.1 we are moving to support the loading and sharing of Darwin Core Archives.  For those who are not familiar with them, we have put together a short video explaining what they are, who uses them, and how to build them.  If you have any questions, please contact us. Enjoy!

Where is my collection located?

Tuesday, January 6th, 2009

There has been some recent discussion on associating geographical coordinates with a collection or museum.  If you are interested in updating your information in the Biodiversity Collection Index and are not sure how to go about finding your coordinates, you can follow this simple tutorial to get your location without any GPS device. (more…)

Visualization of the Biodiversity Collection Index

Sunday, January 4th, 2009

Ever wanted to know where all the historical plants are stored? How about bugs, insects, spiders, butterflies, or fish? Well with the help of Biodiversity Collection Index (BCI) and their wonderful resources of information we are finally able to get a true interactive visualization of how our world is collected and where that information is housed.

Visit the Interactive Map (http://labs.silverbiology.com/biocol)

Research into biodiversity relies on the use of specimens. These specimens are held in reference collections around the world. BCI is a central index to these collections. With the help of BCI’s Web Services, SilverBiology was able to use its new open source web tool SilverMapper to easily map the location of these collections.

This data source is based directly on the data from the Biodiversity Collection Index (BCI), and all geospatial data is estimated using the Google Maps geocoding service to establish a latitude and longitude.

Why did we do this?

We wanted to show a real world example of our new program SilverMapper and at the same time provide something useful for the community. Comments and suggestions are welcome.  I would also like to thank Roger Hyam for all his hard work over at the Royal Botanic Garden in Edinburgh for really bringing the Biodiversity Collection Index together.  I hope this little demonstration will encourage collection managers to update their information with the exact latitude and longitude position at BCI to help provide the precise location of where people can find their collection.

List of Nomenclators

Monday, March 17th, 2008
  • ICTV
  • IF
  • ICSP
  • Bergey’s Manual
  • INA
  • ZR
  • uBIO
  • CoLp
  • IPNI

(more…)

Useful Terms

Friday, February 15th, 2008
ABCD: Access to Biological Collection Data. See http://www.bgbm.org/TDWG/CODATA/Schema/default.htm.
Access Point: The URL (web address) of a Web Service.
Backus-Naur Form: A metasyntax used to express context-free grammars. See http://en.wikipedia.org/wiki/Backus-Naur_form.
BioCASe: Biological Collections Access Service. See http://www.biocase.org.
CNS: Concept Name Server. A service to get information about existing conceptual schemas and their concepts.
Concept: Definition of a property, class or relationship.
Conceptual schema: A formal definition of concepts. It can also be seen as a data model or ontology.
Data source: The term used in the BioCASe project for an access point.
Dublin Core: Dublin Core Metadata Initiative. See http://dublincore.org.
DiGIR: Distributed Generic Information Retrieval. See http://digir.net.
Federation schema: A conceptual schema adopted by a federation.
GBIF: Global Biodiversity Information Facility. See http://www.gbif.org.
GET: HTTP communication method where form data are encoded as parameters in an extension to a URL. The GET method is principally used to transmit requests for data to a web server (e.g., a simple database search).
HTML: Hypertext Markup Language. An application of Standard Generalised Markup Language (SGML), used for authoring pages for the World Wide Web.
HTTP: Hypertext Transfer Protocol, the commonly used protocol for transmitting requests and documents between applications on the World Wide Web.
KVP: Key-Value Pair. One of the possible encodings for TAPIR requests.
OGC: Open Geospatial Consortium. See http://www.opengeospatial.org.
OMG: Object Management Group. See http://www.omg.org/.
NCD: Natural Collections Descriptions. A TDWG emerging standard for describing collections of natural history material. See http://www.tdwg.org/NCD/TDWG_NCD_Subgroup.htm.
normative: Referring to a standard or set of norms that are understood to be correct. A normative document is one which describes how things ought to be and why.
Output Model: An XML schema language (or potentially other) formatted response structure.
POST: An HTTP communication method that can include any kind of data or command. The data are encoded separately and do not form part of the URL as in a GET message, so this method is better for complex, sensitive, lengthy or non-ASCII data.
protocol: An agreed format for transmitting data between two or more devices.
Provider: Originally defined as an organisation hosting either a DiGIR or a BioCASe service. In the context of TAPIR, an organisation hosting a TAPIR access point, which may point to several data sources.
Provider software: Software running on a web server that facilitates access to data.
RDF Schema: A language for describing vocabularies in the Resource Description Framework (RDF). See http://www.w3.org/TR/rdf-schema/.
SDD: Structure of Descriptive Data. A TDWG, XML-based interoperability standard for descriptive data.
SOAP: Simple Object Access Protocol, an XML-based messaging protocol used for invoking web services and exchanging structured data.
TAPIR: TDWG Access Protocol for Information Retrieval.
TCS: Taxonomic Concept Transfer Schema. An XML schema for the exchange of taxon concepts. See http://tdwg.napier.ac.uk.
TDWG: Taxonomic Databases Working Group. See http://www.tdwg.org/.
TSA: The Species Analyst, a research project developing standards and software tools for sharing biodiversity information. See http://speciesanalyst.net/.
UDDI: Universal Description, Discovery and Integration. UDDI is a specification for maintaining standardised directories of information about web services.
URL: Uniform Resource Locator. The address of a resource on the Internet.
URI: Uniform Resource Identifier. A formatted string that serves as an identifier for a resource, typically, but not exclusively, on the Internet. URIs are used in HTML hyperlinks.
W3C: World Wide Web Consortium. See http://www.w3c.org.
Web Service: A service based on Internet Protocols, such as HTTP, SMTP or FTP, and also based on XML.
WFS: Web Feature Services. An Open Geospatial Consortium XML-based standard to enable transfer of geographic feature data using Geography Markup Language (GML). See http://schemas.opengis.net/wfs/.
wrapper: Software that allows standardised queries to be run against an underlying database.
WSDL: Web Services Description Language. An XML format for describing Web Services as a set of end points operating on messages containing either document-oriented or procedure-oriented information. WSDL is the language used by UDDI.
XMI: XML Metadata Interchange. An OMG standard for exchanging metadata information via XML. See http://www.omg.org/technology/documents/formal/xmi.htm.
XML: Extensible Markup Language developed by the W3C. A means of tagging data for transmission, validation and manipulation. See http://www.w3.org/XML and http://www.w3.org/TR/REC-xml.
XML Schema: A formal definition of the required and optional structure and content of XML formatted documents within its domain. See http://www.w3.org/XML/Schema.
XPath: Defines a way of locating and processing items in XML documents by using an addressing syntax based on the path through the document’s logical tree structure. See http://www.w3.org/TR/xpath.
XQuery: XML Query Language. A W3C specification for querying XML formatted data.
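
To make the GET, POST and KVP entries above a little more concrete, here is a small Python sketch that prepares the same key-value pairs both ways without sending anything. The access point http://example.org/service and the genus and limit keys are invented for illustration and do not belong to any real service.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical access point and key-value pairs.
ACCESS_POINT = "http://example.org/service"
params = {"genus": "Quercus", "limit": "10"}

# KVP encoding: key=value pairs joined with "&".
encoded = urlencode(params)

# GET: the encoded pairs become part of the URL.
get_request = Request(f"{ACCESS_POINT}?{encoded}")

# POST: the URL stays clean and the pairs travel in the request body.
post_request = Request(ACCESS_POINT, data=encoded.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```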