Updated Chemistry Web Services – now with Density

I mentioned a while back the web services that Rajarshi Guha had set up for us. We are often in need of molecular weight and density data for both solutes and solvents since we rely on an assumption of volume additivity when calculating concentration.

Since Rajarshi moved to the NIH, the location of the services has changed. We now have the CDK installed on a Drexel server so some of the simple services like MW and SMILES generation are still available there.

However, density has been challenging to provide as a service. Experimental density values for solvents are commonly available but calculated densities of solids are hard to find. ChemSpider is one of the few sources where calculated densities of solids and liquids are freely available. Unfortunately there are currently no ChemSpider density web services.

As an interim solution for the UsefulChem and ONSChallenge projects we have set up a look-up table as a Google Spreadsheet (SolventLookUp) for most solvents of potential interest. Solutes added to our SolubilitiesSum sheet are automatically added to a SoluteLookUp SQL database running at Oral Roberts University and the ChemSpider densities are added there via an automated but slow process.

Andrew Lang has used these resources to provide web services returning densities and other properties or descriptors. These data sources are especially important for the nearly automated production of new editions of the ONS Challenge Solubility Book. This is not a general solution since it only includes compounds of interest to our group and would not scale (at least for licensing reasons) to millions of compounds.

But it does come in handy for us because we can quickly call these services within a Google Spreadsheet to do a variety of useful calculations, minimizing the errors that come from copying and pasting.

As an example see the following ChemServices sheet. Enter the common name for a solvent or solute and the number of millimoles and the sheet will automatically calculate the corresponding number of milligrams or microliters. [Note that Google Spreadsheets can only handle a maximum of 50 web service calls at a time - a useful trick is to highlight cells after the calculations then copy and "paste as values". Make sure to keep some cells with the web service calls in case you need to do more calculations in the future]
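
The arithmetic behind the sheet is simple enough to sketch in a few lines of Python. This is only an illustration, not the formulas actually used in the ChemServices sheet: the function names are made up, and the molecular weight and density for methanol are approximate handbook values typed in by hand rather than fetched from the web services.

```python
def millimoles_to_milligrams(mmol, molecular_weight_g_per_mol):
    """mmol x (g/mol) gives mg directly, since the factors of 1000 cancel."""
    return mmol * molecular_weight_g_per_mol


def milligrams_to_microliters(mg, density_g_per_ml):
    """A density in g/mL is numerically the same as mg/uL."""
    return mg / density_g_per_ml


# Approximate handbook values for methanol (assumed, entered by hand).
METHANOL_MW = 32.04        # g/mol
METHANOL_DENSITY = 0.792   # g/mL near room temperature

mg = millimoles_to_milligrams(10, METHANOL_MW)
ul = milligrams_to_microliters(mg, METHANOL_DENSITY)
print(f"10 mmol methanol = {mg:.1f} mg = {ul:.1f} uL")
```

For a solid solute only the first step is normally needed. The density matters when the solute's volume has to be included under the volume additivity assumption.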

Nature Precedings as an Archiving Tool for ONS Solubility Book

The issue of archiving and citation is a topic that is usually raised whenever I give a talk about Open Notebook Science. We have recently tried to address this using several complementary strategies.

The publication of a book containing a snapshot of all the values obtained from the Open Notebook Science Solubility Challenge has turned out to be a convenient mechanism. By using LuLu, the book can be either downloaded for free as a PDF or ordered as a physical copy for just the printing and shipping charges.

However, Lulu does not have a convenient method of keeping track of different editions of the book and it is unclear how to best cite them.

Nature Precedings solves both of these problems quite nicely. I have uploaded the PDF of each book edition to NP and the versions are automatically linked to each other. In fact if you try to access an older edition, NP pops up a warning that a more recent version is available with the corresponding link (see image below).

Precedings also provides information about how to cite the document, including a DOI for each version. Unfortunately it appears that it can take some time for the DOIs to resolve. Links to different versions can also be formatted like this:

http://precedings.nature.com/documents/4243/version/1
http://precedings.nature.com/documents/4243/version/2
http://precedings.nature.com/documents/4243/version/3

Links to the Lulu version of each book are also provided, which is convenient for anyone who might want to order a physical copy.

At this time Precedings does not accept zip files containing the full archive of the source files for each book version – although a link to the archive is provided in the preface of the book. We have found that our library’s DSpace repository is a convenient location for these.

Robert Grubbs Webinar on March 2, 2010

Honeywell Nobel Interactive Studio will host an interactive seminar with Robert Grubbs at 11:00 ET on March 2, 2010. Sign up here. Questions can be submitted via email, Twitter, Facebook or Orkut.

2005 Nobel Laureate in Chemistry, Robert Grubbs, will discuss how the availability of a catalyst that promotes scrambling of the fragments of a carbon-carbon double bond by a metathesis reaction has led to a variety of commercial applications including the production of tough polymers and highly functionalized pharmaceuticals.

Science Commons Symposium Thoughts

The Science Commons Symposium held at the Microsoft Campus in Redmond on Feb 20, 2010 turned out to be the best conference I have attended in the past year. Hope Leman and Lisa Green did a fantastic job of lining up an electric group of speakers and making sure that everything ran smoothly. Chris Pirillo provided streaming video of the talks and the liveblogging on FriendFeed and Twitter was pretty active. The recordings will be made available shortly.

It was utterly captivating from start to finish. Cameron Neylon started us off with “Science in the Open: Why do we need it? How do we do it?” by outlining the tremendous opportunities of doing science more openly while remaining aware of the obstacles. I followed up with a specific Open Science implementation, “Using Free Hosted Web2.0 Tools for Open Notebook Science”, including the recent work I did with Andrew Lang on creating snapshot archives of a notebook with source files.

Antony Williams followed with “ChemSpider: Collecting and Curating the World’s Chemistry with the Community”, convincingly demonstrating the power of crowdsourcing to curate Open Data. Peter Murray-Rust then covered “Open Data and how to achieve it”, pointing out the role of an embargo period in getting people to start to participate in exposing data. All of these presentations made the symposium fairly chemistry-centric but I don’t think the audience minded – and there were a few chemists in the audience.

After lunch Heather Joseph from SPARC talked about “Is Open Access the ‘New Normal’?”. Her talk focused on the role of policy change in supporting OA, for example how NIH funded work is required to be OA within 12 months of publication. Stephen Friend blew a lot of minds with his talk on “Setting Expectations: Need for Distributed Tasks and Evolving Disease Models”. I’m not quite sure I completely get his network approach compared to our current disease models of targeting a specific receptor but I am sure I’ll come across it again since it depends on the processing of (vast amounts of) Open Data.

Peter Binfield proudly recounted the achievements of PLoS ONE, of course including the article-level metrics: “PLoS ONE and article-level metrics – A case study in the Open Access publication of scholarly journals”. I didn’t agree with his call for converting all the metrics to a single number for academic performance reporting – but that did lead to a vigorous discussion on FriendFeed.

Finally John Wilbanks from Science Commons delivered the keynote. It was a mesmerizing overview of what is needed to make Open Science more productive and the importance of working at the bottleneck. He described the elegant way in which the CC0 license allows for a very simple way of making data available as if it were public domain, regardless of the laws in various countries. He also showed his current work on trying to make automatic licenses for processes under patent protection and material transfer agreements.

Brian Glanz has provided a detailed summary of all the sessions, including a wealth of links to slides and additional information.

My slides:

#scspn

Support Open Data by endorsing the Panton Principles

If you care about Open Data take a few seconds today to endorse the Panton Principles. There are also logos there to label your work as Open. Some fit nicely in the navigation bar of a wiki.

Science is based on building on, reusing and openly criticising the published body of scientific knowledge.
For science to effectively function, and for society to reap the full benefits from scientific endeavours, it is crucial that science data be made open.

ONS Solubility Book: Edition 3 with Notebook Archive

Edition 3 (2010-02-11) of the ONS Solubility Challenge book is now available.

We’ve been trying for some time to find a way to conveniently take a snapshot of our Open Notebooks and all associated raw data files. This could serve as a way to back up all of our work as well as provide a means of finding out the state of knowledge for a project at a given moment in time. There is also a tremendous benefit to confidently using the best of free hosted Web2.0 services out there (e.g. GoogleDocs and Wikispaces) without being concerned with changes in policies or access down the road.

Our recent use of the ONS Challenge Solubility book to periodically create releases of summarized data has opened up a convenient opportunity. And yesterday the last piece of the puzzle fell into place. Through a combination of fairly quick manual and automated tasks, Andrew Lang and I are able to push out a full snapshot of all relevant files and lab notebook pages and associate it with an edition of the book.

As described below, the archive is accessible interactively on a server, as a zip download or as a CD from LuLu. Perhaps we can also find a home on library servers in the future.

More details are provided in the preface for Edition 3 (2010-02-11):

This is the first edition to include a full archive of the ONS Challenge notebook. A space export from Wikispaces provides an initial version of all the HTML pages in the notebook with local hyperlinks to copies of all images and files uploaded onto the wiki. All of the Google Spreadsheets are automatically downloaded as Excel spreadsheets and placed in the same “files” folder as the images. NMR spectra, stored as JCAMP-DX files, are placed in the “spectra” folder. All of the HTML pages are reformatted to provide local references to both Excel spreadsheets and the JCAMP-DX files.

The notebook archive is meant to represent a snapshot of the state of all source documents at the time of the publication of an edition of this book. When used from a server with web services running, clicking on links to the spectra will allow interaction via a browser interface, including zooming in or out and integration of the NMR spectrum. When accessed in stand-alone mode after downloading or directly from a CD, everything will work the same, except that JCAMP-DX files must be opened from JSpecView running on the desktop. Excel files will retain any calculations in the cells of the original Google Spreadsheets but dynamic values generated from calling web services – such as the script that automatically integrates NMR spectra – will be frozen as simple values. However the link to the web service used will be stored in the cell as a comment. Links to external websites are not crawled and embedded Google Spreadsheets or videos are not copied. These will work but will reflect live data on the web.

The February 11, 2010 version of the notebook archive is available on a hosted site, on a CD or by download.

Funding Agencies and Open Science

I’ve been invited to participate in a panel discussion on “New tools in research, teaching, and publishing” on May 24, 2010 at the annual PI meeting for the Integrative Graduate Education and Research Traineeship (IGERT) program at NSF. After speaking with program manager Vikram Jaswal, I feel encouraged that funding agencies are interested in exploring the emerging role of Open Science and related novel communication channels for facilitating scientific progress.


The role that funding agencies can play in Open Science has been the subject of some discussion in the blogosphere. One view is that they can require more openness as a condition of funding. The NIH’s requirement to make papers resulting from funding Open Access within 12 months of publication is a step in that direction. There is a debate about whether this should be extended to Open Data – even to the point of Open Notebook Science, where even failed experiments would be shared for the scientific community to learn from.

I tend to prefer the carrot to the stick. I think that funding agencies could value plans for “sharing beyond the norms” in proposals without imposing strict requirements. In the long run OS will succeed because each stakeholder (researcher, funder, publisher, etc.) acts out of selfish motives. I believe that the most effective way to stimulate this selfishness is to show concrete examples of practice and benefits.

Funding agencies should see the benefits of OS as a higher ROI – in terms of knowledge gained and shared with the scientific community – as well as the wider population ultimately footing the bill. A perceived downside of higher transparency might be the greater difficulty in fueling hype cycles. Most things aren’t as pretty up close and science is no exception. If you measure success as the absence of failure and ambiguity then increased transparency is going to be a problem. Most experiments are failures of some sort (as the saying goes – if you’re not failing you’re not trying hard enough). But failed or successful – both categories of results can be useful to others if they are made available in a way that they can be discovered easily. Funding agencies can help transparency by making it clear that the whole truth is more valuable than a subset of the truth presented in a way that might be conveniently misleading.

This doesn’t mean that you can’t put your best foot forward and give a slick PowerPoint presentation to guide your audience. It is ok to construct an easily digestible narrative of your research. It is ok to distill your work down to key conclusions. It isn’t necessary to confuse your audience with every ambiguous result and unanswered question.

But – in addition to the streamlined version of your work – if you provide all the details of the failures and ambiguities for those who can benefit from further exploration of what you have done – there is a great potential for accelerating the scientific process. For a funding agency OS can mean a bigger bang for the buck.

ONS talk at UPenn Library

On January 21, 2010 I presented a talk at the Van Pelt Library at the University of Pennsylvania about “Open Notebook Science and other Science2.0 Approaches to Communicate Research”. I had a very interesting chat with some of the folks there who work with my host Shawn Martin.

The role of librarians is certainly changing. When I was in grad school the main interaction I remember is our librarian doing STN searches. Electronic databases were new and very expensive at that time and the librarian’s skill was required to query the database in an efficient manner to minimize cost. Students were not allowed direct access.

I remember when I was allowed to ask for a substructure search of cyclobutene systems for my thesis – it felt magical and the results were like gold. There was no way for me to do this using index books. Now this type of search is so routine that students usually look bored doing it for their projects.

The issues for librarians now are completely different. The definition and meaning of scholarship is shifting. Instead of scarce resources, there is an abundance of tools, content and social networks out there – much of it free. Openness in all of its forms is becoming possible and is forcing people to take a position.

I think Shawn’s approach makes a lot of sense. He tries to provide options for his faculty and students to choose from without imposing a particular philosophy. My talk was just another example to present.

[My only regret is that I used a new computer without checking the audio settings first and it was set a little too high so the audio quality isn't perfect. Hopefully it is still intelligible for the most part.]

Science Online 2010 Thoughts

This year’s Science Online conference at the Research Triangle Park, NC was a satisfying experience.

I had the pleasure of meeting face to face for the first time people I have gotten to know quite well over the blogosphere: Steve Koch, Hope Leman, Walter Jessen, Pawel Szczęsny and Andy Farke. This is probably the best conference for me to catch up with friends and collaborators – Bill Hooker, Tony Williams, Cameron Neylon, Deepak Singh, Carmen Drahl, Dorothea Salo, Christina Pikas and several others.

My session on Second Life Saturday didn’t work out so well. We had major connectivity problems both at the conference side (bandwidth maxing out and even the router getting unplugged at one point!) and on Second Life. We spent quite a bit of time before the session trying to get things under control but SL voice failed for everyone there after working briefly. I also got kicked out repeatedly and had trouble teleporting. I did manage to follow Max Chatnoir to her always impressive Genome island but only saw her type a few lines of chat.

That was very disappointing and I’m not sure I’ll attempt another live demo like this again. After so many years in operation the Second Life servers really should be reasonably stable, given the annual fee we pay for our islands. I think a better use of the technology might be a parallel but separate track only on Second Life, where some of the presenters can display their posters for several days and visitors can leave comments or arrange to meet at certain times. This is what Andy Lang and I did for ACS island a while back and it worked fairly well.

The session on Open Notebook Science I co-chaired with Cameron Neylon and Steve Koch on Sunday went a lot better. I provided a context by demonstrating the utility of ONS in resolving the NaH oxidation controversy followed by the example of the aqueous solubility of EGCG, where the lack of access to raw data in the literature and company catalog leads to an unnecessarily confusing situation. At the end I mentioned the case where simply reading the lab notebook of Alexander Graham Bell exposed a scandal detailed in Seth Shulman’s new book “The Telephone Gambit”.

Within that framework I provided an overview of the ONSChallenge and the Wikispaces/Google Spreadsheet/Blogger system we use in my lab. Cameron then spoke a bit about the LaBLog system he uses and the broader scope of incorporating automation in the creation of the notebook records. Finally Steve reflected on his experience with OpenWetWare in both a teaching lab and his research group. He displayed some positive comments he received about ONS in a recent grant application. The discussion afterward moved into the challenge of archiving large amounts of data. I mentioned that we are still looking for a library partner for our ONSarchive project.

On Saturday night during dinner there was an “Ignite” style session where speakers are given about 5 mins to go through their slides, which change automatically every 20 seconds. I presented with Tony on Games in Chemistry. It turned out to be an eclectic collection of talks, which are worth a watch when they are posted.

I enjoyed Jonathan Eisen’s session on Open Access and Peter Binfield’s on PLoS ONE article level metrics. I learned that the DOI must be used for the blog citation metric to work properly and that all the statistics can be downloaded as an Excel file. The scientific world would operate much more smoothly if the mainstream adopted a fraction of the philosophies espoused in these sessions.

My favorite session was Andy Farke’s demo on the Open Dinosaur Project to crowdsource the measurement of bones. It was exciting to see that his data management system using Google Spreadsheets is similar to our ONS Solubility Challenge. It is possible that he could use the code that Andy Lang wrote to activate bots to flag discrepancies and perhaps semi-automatically publish a book with a summary of the results in a similar way that we do. Instead of pictures of molecules his entries would have images of dinosaurs. We’ll follow up to see what is feasible.

#scio10

Ugi Reaction as Mettler-Toledo Application Note

Our JoVE paper Optimization of the Ugi Reaction Using Parallel Synthesis and Automated Liquid Handling (Jean-Claude Bradley, Khalid Baig Mirza, Tom Osborne, Antony Williams, Kevin Owens) was adapted as an Application Note for Mettler-Toledo. Lots of interesting things can be done easily when you publish in a journal with an Open Access option.


I just noticed that our number of views is almost at 8000 on JoVE. After a little over a year the views are still coming in at a fairly steady pace. Article-level metrics are one of the best things to have come along for authors in the scientific publication process.

Dangerous Data: Lessons from my Cheminfo Retrieval Class

I’m not sure what my students expected before taking my Chemical Information Retrieval class this fall. My guess is that most just wanted to learn how to use databases to quickly find “facts”. From what I can gather much of their education has consisted of teachers giving them “facts” to memorize and telling them which sources to trust.

Trust your textbook – don’t trust Wikipedia.
Trust your encyclopedia – don’t trust Google.
Trust papers in peer reviewed journals – don’t trust websites.

If I did my job correctly they should have learned that no sources should be trusted implicitly. Unfortunately squeezing useful information from chemistry sources is a lot of work and hopefully they learned some tools and attitudes that will prove helpful no matter how chemistry data is delivered in the future.

I have previously discussed how trust should have no part in science. It is probably one of the most insidious factors infesting the scientific process as we currently use it.

To demonstrate this, I had students find 5 different sources for properties of chemicals of their choice. Some of the results demonstrate how difficult it can be to obtain measurements with confidence.

Here are my favorite findings from this assignment as a top 3 countdown:

#3 The density of resveratrol on 3DMET

Searching for chemical property information on Google quickly reveals the plethora of databases indexed on the internet with a broken chain of provenance. These range from academic exercises of good will to company catalogs, presumably there to sell products. Although it is usually not possible to find out the source of the information, you can sometimes infer the origin by seeing identical numbers showing up in multiple places.

But sometimes the results are downright bizarre – consider the number 1.009384166 as the density of resveratrol from 3DMET, which looks like a Japanese government site. First of all no units are given but let’s assume this is in g/ml. The number of significant figures is curious and suggests the result of a calculation, perhaps a prediction. In this case the source is the MOE software. This is clearly a different algorithm from the one used by ACDLabs, which comes in at 1.356 g/ml, much more realistic when put up against all 5 sources:

  • 1.359 g/cm3 ChemSpider predicted
  • 1.36 g/cm3 (20 C) Chemical Book MSDS
  • 1.009384166 (no units given) 3DMET
  • 1.41 g/cm3 (-30.15C) DOI (found with the aid of Beilstein)
  • 1.359 g/cm3 LookChem
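
A rough way to make the comparison explicit is to measure each value against the median of the five. The sketch below is just that, a sketch: the 10% tolerance is an arbitrary cutoff chosen for illustration, the 3DMET number is assumed to be in g/cm3 since no units were given, and the temperature differences between sources are ignored.

```python
from statistics import median

# Densities of resveratrol as listed above (g/cm3; the 3DMET units are assumed).
reported = {
    "ChemSpider predicted": 1.359,
    "Chemical Book MSDS": 1.36,
    "3DMET": 1.009384166,
    "DOI (found via Beilstein)": 1.41,
    "LookChem": 1.359,
}


def flag_outliers(values, tolerance=0.10):
    """Return the entries that differ from the median by more than `tolerance` (as a fraction)."""
    center = median(values.values())
    return {
        source: value
        for source, value in values.items()
        if abs(value - center) / center > tolerance
    }


for source, value in flag_outliers(reported).items():
    print(f"{source}: {value} differs from the median by more than 10%")
```

With these numbers only the 3DMET entry is flagged. The low temperature measurement at 1.41 g/cm3 is within about 4% of the median.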

#2 The melting point for DMT depends on the language

I have to admit being really surprised by this. Even though I knew that Wikipedia pages in different languages were not exact translations I would have assumed that the chemical infoboxes would not be recreated. Interestingly, the German edition has a reference but I was not able to access it since it is a commercial database. The English edition has no specific references. Here is a list of sources:

#1 Solubility of EGCG in water

This is by far my favorite because it most clearly demonstrates the dangers of the concept of a “trusted source”. From the compilation prepared by the student, this paper (Kwang08) reported the solubility of EGCG at 521.7 g/l:

This is from a paper that spent 5 months undergoing peer review with a well respected publisher. Also it appeared recently so one would expect the benefit of the best instruments and comparison with historical values. But even beyond all of this, the numbers are in the opposite order to the point explained in the paragraph. In our system of peer review we don’t expect reviewers to verify every data point – but we do expect the text to be evaluated as logically consistent.

Now if we follow the reference provided for this paragraph we find the following paper (Liang06), with this:

We can now see what happened: the 21.7 was accidentally duplicated from the caffeine measurement and appended to the 5 g/l for EGCG. This is a lot more reasonable, even though I am not clear about where that number comes from in this second paper.

We can get some idea of the potential source of this information from the Specification Sheet for EGCG on Sigma-Aldrich:

Notice that this does not state that the maximum solubility of EGCG in water is 5 mg/ml – just that a solution of that concentration can be made. This value is repeated elsewhere, such as this NCI document, which references Sigma-Aldrich:

From here the situation gets muddled. Another search reveals this peer reviewed paper (Moon06), which appeared in 2006:

Expressed in mM this translates to about 2.3 g/l. Clearly this value is inconsistent with the Sigma-Aldrich report of being able to make a clear solution at 5 g/l.
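
To keep the units straight when comparing these sources, the conversion between g/l and mM only needs the molecular weight of EGCG (C22H18O11, about 458.37 g/mol). The snippet below is a minimal sketch with made-up function names. It converts the g/l figures quoted in this post back to mM and does not reproduce any number taken from the papers themselves.

```python
EGCG_MW = 458.37  # g/mol for C22H18O11


def grams_per_liter_to_millimolar(g_per_l, mw_g_per_mol):
    """(g/L) / (g/mol) gives mol/L; multiply by 1000 for mM."""
    return g_per_l / mw_g_per_mol * 1000


def millimolar_to_grams_per_liter(millimolar, mw_g_per_mol):
    """Inverse conversion: mM x (g/mol) / 1000 gives g/L."""
    return millimolar * mw_g_per_mol / 1000


# The three EGCG values discussed in this post, all in g/L.
for label, g_per_l in [("Moon06", 2.3), ("Sigma-Aldrich", 5.0), ("Kwang08", 521.7)]:
    mm = grams_per_liter_to_millimolar(g_per_l, EGCG_MW)
    print(f"{label}: {g_per_l} g/L is about {mm:.1f} mM")
```

On this scale 2.3 g/l is roughly 5 mM and 5 g/l is roughly 11 mM, while 521.7 g/l would be more than 1 M.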

Luckily, in this case we have some details of the experiments:

The measurements were done in triplicate and averaged. Unfortunately this does not reveal any sources of systematic error. One clue as to why these values are contradictory might be the method of dissolution. One hour of sonication at room temperature might just not be enough to make a saturated solution for this compound. (Although one might expect the error to lie on the high side because the samples were diluted before being filtered.) What would answer this definitively are the experimental details of how the Sigma-Aldrich source prepared the 5 g/l solution. If it went in within a few minutes without much agitation, that would be inconsistent with this hypothesis of insufficient mixing. In that case we would want to look at the HPLC traces in this paper for another type of systematic error.

Unfortunately, the chain of information provenance ends here. Just based on the data provided so far, there is significant uncertainty in the aqueous solubility of EGCG, similar to our uncertainty about the melting point of strychnine.

As long as scientists don’t provide – and are not required to provide by publishers – the full experimental details recorded in their lab notebooks, this type of uncertainty will continue to plague science and make the communication of knowledge much more difficult than it need be.

Unfortunately the concept of “trusted sources” is being used as a building block of some major chemical information projects currently underway – WolframAlpha and the chemical infobox data of Wikipedia are prime examples. Ironically, MSDS sheets are listed as a reliable “trusted source” for the infoboxes, when they have been shown to be very unreliable (see my previous post about this with statistics). These are probably one of the most dangerous sources of information because they appear to be trustworthy – coming from chemical companies and the government – and often found on university websites. Combine that with the absence of references or experimental details and the potential for replication of errors is very high and very difficult to correct.

WolframAlpha does have a mechanism to provide information about sources but it requires submitting a reason and personal information.

To see how this works in practice I made a request for the source of an entry with erroneous data – glatiramer acetate:

I submitted this 10 days ago and still don’t know the source.

Rapid access to specific sources is important for maximizing the usefulness of databases. Without that it becomes very difficult to assess the meaning of reported measurements and compare with results from other databases.

It is not possible to remove all errors from scientific publication. But that’s only a problem when it is difficult to determine that there are errors in the first place because insufficient information is provided.

Scientists can handle ambiguity. If you look at the discussion over the blogosphere concerning the JACS NaH oxidation paper, much of it was constructive. The publication of that paper was not a failure of science. Quite the opposite – we learned some valuable lessons about handling this reagent. As far as I can tell the paper was a truthful reporting of their results.

Where this was a failure lies in the way conventional scientific channels handled the matter. There was no mechanism to comment directly on the website where the paper was posted. That would have been the logical place for the community to ask questions and have the authors respond. Instead the paper was withdrawn without explanation.

ONS Solubility Book: Edition 2 – with Predicted Values

The Second Edition (2009-12-27) of the Open Notebook Science Solubility Challenge book is now available. The issues with some missing text have been resolved, in addition to providing clickable links for the references in the PDF version.

However, the main difference is the addition of a new section on solubility predictions. The book is now somewhat larger than the first edition, coming in at 129 pages but still very affordable at $8.16 (covers printing and shipping costs from LuLu).

This was added to the preface:

Predicted Solubilities

In this edition, a new section is added to provide predicted solubility values for selected solutes in a range of solvents. Specifically, solutes are included when measurements from at least 5 different solvents are available. A method using Abraham descriptors depends on the experimental solubility measurements from several solvents to make predictions, which is detailed in that section of the book. For this reason, this edition also includes some aqueous solubility measurements, which are generally available from the literature. The focus of this collection remains on non-aqueous solubility.

Consistent with how the experimental measurements are made available, the predicted solubility values are provided as a work in progress. The purpose in providing them is to suggest solvents of interest for various applications. The boiling point of each solvent is also listed in the table to allow a convenient selection. When available, experimental measurements are listed next to the predicted values. This information can be helpful to gauge the usefulness of the model to some extent but does not guarantee its reliability for the other solvents. As more measurements are collected the reliability of the predictions is likely to increase and this will be reflected in future editions of this book.
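
For readers unfamiliar with Abraham descriptors, the general form of the model is a linear equation, log10(S_solvent / S_water) = c + e·E + s·S + a·A + b·B + v·V, where the lowercase coefficients belong to the solvent and the uppercase descriptors belong to the solute. The sketch below only evaluates that equation. It is not the procedure used for the book, and every number in it is an invented placeholder, not a published coefficient or a fitted descriptor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SolventCoefficients:
    c: float
    e: float
    s: float
    a: float
    b: float
    v: float


@dataclass(frozen=True)
class SoluteDescriptors:
    E: float
    S: float
    A: float
    B: float
    V: float


def log_solubility_ratio(solvent, solute):
    """log10(S_solvent / S_water) from the Abraham linear free energy relationship."""
    return (
        solvent.c
        + solvent.e * solute.E
        + solvent.s * solute.S
        + solvent.a * solute.A
        + solvent.b * solute.B
        + solvent.v * solute.V
    )


def predicted_solubility(aqueous_solubility_molar, solvent, solute):
    """Scale the aqueous solubility (mol/L) by the predicted ratio."""
    return aqueous_solubility_molar * 10 ** log_solubility_ratio(solvent, solute)


# Placeholder values for illustration only (not real coefficients or descriptors).
example_solvent = SolventCoefficients(c=0.2, e=0.4, s=-0.5, a=0.1, b=-3.0, v=4.0)
example_solute = SoluteDescriptors(E=1.0, S=1.0, A=0.5, B=0.5, V=1.0)

ratio = log_solubility_ratio(example_solvent, example_solute)
print(f"log10(S_solvent/S_water) = {ratio:.2f}")
print(f"predicted solubility = {predicted_solubility(0.001, example_solvent, example_solute):.3f} M")
```

Because the equation predicts a ratio relative to water, an aqueous solubility value is needed to turn it into a solubility in the organic solvent.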

Andrew Lang has been busily learning about building models using Abraham descriptors. As luck would have it, Michael Abraham just published an extensive collection of his descriptors for many solvents in a recent publication:

Abraham M.H.; Smith R.E.; Luchtefeld R.; Boorem A.J.; Luo R.; Acree Jr. W.E. Prediction of solubility of drugs and other compounds in organic solvents. J. Pharm. Sci. Early View Sept. 22 (2009) http://dx.doi.org/10.1002/jps.21922

This is an important step for the ONS Challenge project because it takes us closer to the eventual goal of providing chemists with an open tool for anticipating the solubility behavior of their reactants and products in a particular solvent. Researchers might think of trying new solvents after perusing their measured or predicted solubilization potential for a given solute.

We don’t know how good the predictions will turn out but we will certainly find out in the coming months and report as we go. Even though the Submeta awards have all been distributed we still welcome measurement contributions.

First Edition of ONS Solubility Challenge Book

Andrew Lang and I have been working on a book version of the Open Notebook Science Solubility Challenge database. The timing is good since we just awarded the last ONS Challenge Submeta award this month. All of the students, judges and educational partner are included as co-authors. A biography and picture of everyone is included in the book.

Jean-Claude Bradley, Associate Professor of Chemistry at Drexel University
Cameron Neylon, Senior Scientist at the ISIS Pulsed Neutron Source, Rutherford Appleton Laboratory and Lecturer in Chemical Biology at the School of Chemistry at the University of Southampton
Rajarshi Guha, Research Scientist at the NIH Chemical Genomics Center
Antony Williams, Vice President of Strategic Development, ChemSpider at the Royal Society of Chemistry
Bill Hooker, Postdoctoral Researcher in Molecular Biology
Andrew Lang, Professor of Mathematics at Oral Roberts University
Brent Friesen, Associate Professor of Chemistry at Dominican University
and
Tim Bohinski, David Bulger, Matthew Federici, Jenny Hale, Jenna Mancinelli, Khalid Mirza, Marshall Moritz, Daniel Rein, Cedric Tchakounte, and Hai Truong

We selected LuLu as a convenient mechanism to distribute copies. This 6 x 9 inch black and white soft cover edition is available for $5.96, which just covers the printing and shipping charges. Other formats are possible – such as a larger hardcover in color – but these are much more expensive. We thought it would be good to start with the most affordable version and look at other options later. The electronic version of the book is available for free on LuLu.

We were inspired by the style of the solubility book published by Atherton Seidell in 1919, freely available on Google Books. The compound entries are listed in alphabetical order, with tables of compound data and solubilities. We included data that we found to be useful for practical applications, including predicted density, room temperature phase and the solubility in molarity, mole fraction and g/100g solvent. References link to lab notebook pages or literature references.

Andy found a way to create the fully formatted book in an almost completely automated way, pulling the data directly from the Solubilities Summary and other Google spreadsheets and querying ChemSpider. The preface and biographies of the students, judges and educational partner are also automatically pulled in from Google Docs. With this system in place, it will be straightforward to publish frequent future editions with the most up-to-date information.

This was also a good opportunity to make use of the WebCite service. It enables us to link the book to a frozen version of the Solubilities Summary sheet archived as an Excel spreadsheet. This format retains all the formulas and hyperlinks in the original Google Spreadsheet.

The preface further explains the scope of the book and project:

The Open Notebook Science Solubility Challenge

Solubility is an important consideration for many chemistry applications. Synthetic chemists usually use a solvent to perform reactions and knowledge of the solubility of the starting materials or products can be very useful to pick an appropriate solvent. Analytical chemists can use solubility to design separation techniques and factor in dynamic range considerations. Physical chemists can create and evaluate their models of how molecules interact in the solubilization and precipitation processes.

Solubility data can be obtained from a variety of online and offline sources. As with all chemical data, it can be a challenge to evaluate reported measurements. Some databases offer no references while others provide citations to peer-reviewed journal articles. Given the choice, more weight is generally given to the latter. This is reasonable in most cases because more information about the purity of compounds and the methods used is available in peer-reviewed articles.

However, information on how a specific measurement was obtained is not generally provided within a journal article. General methods are provided but the raw data for a specific measurement are typically not published. Peer review is not intended to validate individual measurements – its function is to ensure that the authors made appropriate conclusions based on their processed datasets and the state of knowledge in the field.

The Open Notebook Science Challenge was initiated in the fall of 2008 as the result of a discussion on a train in the UK between Jean-Claude Bradley and Cameron Neylon.[1,2] The concept was very simple: create a crowdsourcing opportunity for the chemistry community to contribute solubility measurements under Open Notebook Science conditions. This method of publication entails providing immediate public access to the chemist’s laboratory notebook, as well as all raw data used to compute the measurements.[3,4]

On Sept 3, 2008 the first ONSC measurements were recorded by Bradley and Neylon at the University of Southampton in Neylon’s laboratory.[5] The project was soon sponsored by Submeta, offering ten $500 awards for students in the US or the UK who best recorded how they performed their experiments.[6] Furthermore, the first 3 winners also received one year subscriptions to Nature magazine, thanks to a sponsorship from the Nature Publishing Group.[7] Sigma-Aldrich supported the contest by donating chemicals upon request.[8]

Students were evaluated by a group of judges who convened once a month to deliberate the next award. Judges also provided feedback to the students by commenting on their lab notebook pages directly on the wiki. Their expertise ranged from chemistry to mathematics, spectroscopy and molecular biology.

Techniques

Participants in the ONS Challenge were not required to use a specific method to measure solubility – although they were required to properly document their experiments and analyses. Due to its simplicity, most measurements in the past year were made using the SAMS NMR technique, requiring no volume measurement or calibration curves.[9] Two assumptions are made with this method. The first is that the volumes of solute and solvent are additive, with the error becoming negligible at low solubility values. The second is that NMR integration values are proportional to the amount of solvent and solute. Some deviations from this have been observed for default NMR parameters, and in later experiments long relaxation times were introduced into the protocol (D1 = 50 s).[10]
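The two assumptions above are enough to turn a pair of NMR integrals into a molarity. The following is a minimal sketch of that arithmetic, not the actual SAMS implementation: the function name, argument names and the example numbers are all hypothetical, and it assumes each integral is divided by the number of protons giving rise to that peak.

```python
def molarity_from_integrals(
    solute_integral, solute_protons,
    solvent_integral, solvent_protons,
    solute_mw, solute_density,
    solvent_mw, solvent_density,
):
    """Estimate solute molarity (mol/L) from NMR integrals.

    Molecular weights are in g/mol and densities in g/mL.
    Assumption 2: integral per proton is proportional to moles.
    Assumption 1: the volumes of solute and solvent are additive.
    """
    # Moles of solute per mole of solvent.
    mole_ratio = (solute_integral / solute_protons) / (
        solvent_integral / solvent_protons
    )
    # Basis: 1 mol of solvent. Volumes in mL.
    solute_volume_ml = mole_ratio * solute_mw / solute_density
    solvent_volume_ml = solvent_mw / solvent_density
    total_volume_l = (solute_volume_ml + solvent_volume_ml) / 1000.0
    return mole_ratio / total_volume_l

# Made-up example: a 2-proton solute peak against a 1-proton solvent peak.
example = molarity_from_integrals(
    solute_integral=2.0, solute_protons=2,
    solvent_integral=10.0, solvent_protons=1,
    solute_mw=100.0, solute_density=1.0,
    solvent_mw=50.0, solvent_density=1.0,
)
```

No volume is measured anywhere in this calculation, which is why the solute density (often a predicted value) enters the result.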

Data Curation

Since an Open Notebook approach is used in this work, those interested in the validity of the measurements can assess the methods used – both for the preparation of saturated solutions and the raw data from the measurements. Over time, values in the database are likely to improve and possibly some errors may be uncovered and corrected. However, on the whole, we feel that the values provided in this work should be of use to chemists trying to gain an appreciation of solubility for most applications. This is especially the case for values that are not obtainable from any other source.

When clearly erroneous data points are discovered, they are flagged in the database as “DONOTUSE”. This way interfaces with the dataset can ignore these values while allowing anyone to investigate why the data points were flagged. This might happen when early experiments did not allow for sufficient mixing or NMR D1 relaxation times were not long enough to fully integrate peaks of interest. Out of 681 reported measurements, 51 are currently marked in this way. A shared Google Spreadsheet is used to collect and curate the dataset. This allows easy data entry while providing a simple way to interrogate the database for visualization applications via the Google API.[11]
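An interface that ignores flagged values might look something like the sketch below. The column name `flag` and the row contents are assumptions made for illustration; only the “DONOTUSE” marker itself comes from the database.

```python
def usable_measurements(rows, flag_column="flag"):
    """Return the rows that are not marked DONOTUSE.

    `rows` is a list of dicts, one per measurement. Flagged rows are
    skipped rather than deleted, so they remain available for review.
    """
    kept = []
    for row in rows:
        flag = (row.get(flag_column) or "").strip().upper()
        if flag != "DONOTUSE":
            kept.append(row)
    return kept

# Made-up rows for illustration only.
rows = [
    {"solute": "solute A", "solvent": "methanol", "molarity": 0.50, "flag": ""},
    {"solute": "solute A", "solvent": "methanol", "molarity": 2.10, "flag": "DONOTUSE"},
    {"solute": "solute B", "solvent": "toluene", "molarity": 0.05},
]
clean = usable_measurements(rows)
```

Applied to the counts given above, a filter of this kind would leave 681 - 51 = 630 measurements in use.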

Literature data and format conversions

An additional 400 solubility measurements from the literature are included in the database. These generally correspond to compounds that are structurally identical or similar to the compounds measured by the ONS Challenge participants. These values are averaged in with the values from the participants, with appropriate references provided. In order to compare values, conversions from mole fraction or g solute/100g solvent to molarity were made by assuming that the volumes are additive and obtaining the density of the solutes in most cases from the predicted values in ChemSpider.[12]

For the convenience of chemists with diverse applications, all three formats are provided. For the cases where solutes are miscible with the solvent, the molarity reported is simply the solute’s density expressed in mol/L. The practical interpretation of this is that solutions of any molarity below this value can be prepared.
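The unit conversions described in this section follow directly from the volume additivity assumption. Here is a minimal sketch, with hypothetical function names and made-up example values; molecular weights are in g/mol and densities in g/mL.

```python
def molarity_from_mole_fraction(x, solute_mw, solute_density,
                                solvent_mw, solvent_density):
    """Convert solute mole fraction x to molarity (mol/L), additive volumes."""
    # Basis: 1 mol of solution in total. Volumes in mL.
    solute_volume_ml = x * solute_mw / solute_density
    solvent_volume_ml = (1.0 - x) * solvent_mw / solvent_density
    return x / ((solute_volume_ml + solvent_volume_ml) / 1000.0)

def molarity_from_g_per_100g(grams, solute_mw, solute_density, solvent_density):
    """Convert g solute per 100 g solvent to molarity (mol/L), additive volumes."""
    moles = grams / solute_mw
    total_volume_ml = grams / solute_density + 100.0 / solvent_density
    return moles / (total_volume_ml / 1000.0)

def miscible_molarity(solute_mw, solute_density):
    """Molarity of the pure solute: its density expressed in mol/L."""
    return 1000.0 * solute_density / solute_mw

# Made-up solute (MW 100, density 1.0) in a made-up solvent (MW 50, density 1.0).
m_from_x = molarity_from_mole_fraction(0.5, 100.0, 1.0, 50.0, 1.0)
m_from_g = molarity_from_g_per_100g(100.0, 100.0, 1.0, 1.0)
m_limit = miscible_molarity(100.0, 1.0)
```

As the mole fraction approaches 1, the first function approaches the value returned by `miscible_molarity`, which is why that value serves as the upper bound reported for miscible pairs.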

In the process of converting units and averaging heterogeneous data sources, no attempt has been made to track significant figures. Those interested in any information about the precision of measurements should consult each individual data source. This may not be an easy task for measurements only carried out once and where factors such as the quality of spectral peaks and baselines are not optimal.

This collection will be most valuable for those who do not require highly precise measurements for their applications. For example, synthetic chemists can easily use rough estimates of solubility to select appropriate solvents for a reaction. In any case, one would be wise to consider all measurements as provisional, regardless of the source. As more data are collected, subsequent editions of this book will adjust values accordingly.

Searching the database

The values in this database can be accessed and filtered in various ways. More information is available at the ONS Challenge wiki[13] and Chapter 16 of the book “Beautiful Data”.[14]

Database version

Archived as Excel Spreadsheet by WebCite on December 11, 2009.[15]

References

[1] Bradley, JC Open Notebook Science Challenge, UsefulChem blog (2008) http://usefulchem.blogspot.com/2008/09/open-notebook-science-challenge.html
[2] Open Notebook Science Challenge Wikipedia entry http://en.wikipedia.org/wiki/Open_Notebook_Science_Challenge
[3] Bradley, JC Open Notebook Science, Drexel CoAS E-Learning Blog (2006) http://drexel-coas-elearning.blogspot.com/2006/09/open-notebook-science.html
[4] Open Notebook Science Wikipedia entry http://en.wikipedia.org/wiki/Open_Notebook_Science
[5] Bradley, JC; Neylon, C UsefulChem Experiment 207 http://usefulchem.wikispaces.com/Exp207
[6] Bradley, JC Submeta Open Notebook Science Awards, UsefulChem Blog (2008) http://usefulchem.blogspot.com/2008/11/submeta-open-notebook-science-awards.html
[7] Bradley, JC Nature Sponsors Open Notebook Science, UsefulChem Blog (2008) http://usefulchem.blogspot.com/2008/11/nature-sponsors-open-notebook-science.html
[8] Bradley, JC Sigma-Aldrich First Official Sponsor of Open Notebook Science Challenge, UsefulChem Blog (2008) http://usefulchem.blogspot.com/2008/09/sigma-aldrich-first-official-sponsor-of.html
[9] Bradley, JC Semi-Automated Measurement of Solubility, UsefulChem Blog (2009) http://usefulchem.blogspot.com/2009/03/semi-automated-measurement-of.html
[10] Bradley, JC NMR Integration Progress for Solubility Measurements, UsefulChem Blog (2009) http://usefulchem.blogspot.com/2009/06/nmr-integration-progress-for-solubility.html
[11] Bradley, JC Interactive Visualization of ONS Solubility Data, UsefulChem Blog (2009) http://usefulchem.blogspot.com/2009/01/interactive-visualization-of-ons.html
[12] ChemSpider database http://www.chemspider.com
[13] ONS Challenge List of Experiments Page http://onschallenge.wikispaces.com/list+of+experiments
[14] Bradley, J.-C.; Guha, R.; Lang, A.S.I.D.; Lindenbaum, P; Neylon, C.; Williams, A.J. & Willighagen, E. Chapter 16: Beautifying Data in the Real World from Beautiful Data. O’Reilly Media, Eds: Segaran, T. & Hammerbacher, J. (2009)
[15] Bradley, Jean-Claude; Lang Andrew. Solubilities Summary Sheet. Open Notebook Science Challenge. 2009-12-11. URL:http://spreadsheets.google.com/pub?key=plwwufp30hfq0udnEmRD1aQ&output=xls. Accessed: 2009-12-11. (Archived by WebCite® at http://www.webcitation.org/5lx5ry3BV)

Hai Truong is Dec09 Submeta ONS Award Winner

Hai Truong, working under the supervision of Jean-Claude Bradley at Drexel University, is the December 2009 Submeta Open Notebook Science Challenge Award winner. He wins a cash prize from Submeta.

Hai mainly collaborated with Khalid Mirza to try to understand co-solute effects for Ugi products in benzene. See his experiments here:
http://onschallenge.wikispaces.com/list+of+experiments

This was the final Submeta ONS Award for 2008-9. We would like to thank all the sponsors – Submeta, Nature Publishing Group and Sigma-Aldrich – for making this project a reality. A summary of the results from the past year will be published shortly.

For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/submetaawards08

Communicating Chemistry

In October 2008, I participated in an NSF workshop on eChemistry: New Models for Scholarly Communication in Chemistry. Theresa Velden and Carl Lagoze have now published their reports. Here are the details from their press release:

Public Release of White Paper: The Value of New Scientific Communication Models for Chemistry

Ithaca, NY, November 23, 2009 – The results of a National Science Foundation sponsored workshop in October 2008 are now available in a white paper ‘The value of new scientific communication models for Chemistry’, publicly accessible at http://hdl.handle.net/1813/14150. An article, ‘Communicating Chemistry’, summarizing this white paper, is published in the December issue of Nature Chemistry at http://www.nature.com/nchem/journal/v1/n9/full/nchem.448.html.

This white paper is intended as a starting point for discussion on the possible future of scientific communication in chemistry, the value of new models of scientific communication enabled by web-based technologies, and the necessary future steps to achieve the benefits of those new models. It opens with an overview of publishing reform and e-science initiatives in other disciplines, such as open access, data publishing, and preprint servers. Following this, it reviews the scientific communication system in chemistry, including the established system of journals and databases, and recent web-based innovations and experiments. Next, it analyzes the distinguishing aspects of chemistry that may influence its communication practices and have an impact on the manner in which science communication in chemistry will further evolve.

The white paper concludes with a call for a more comprehensive symposium on this subject. In recognition that the analysis presented in the white paper is yet incomplete, and provides only a starting point for discussion, the proposed international symposium would engage a broad range of participants who would expand on the subjects introduced in the white paper and issue calls for actions and research initiatives. Work on finding funding for this symposium is now in progress.

Members of the chemistry community and other interested parties are encouraged to join in a critical and constructive assessment of the content of the white paper and the issues it addresses. An online forum for this community discussion has been set up at http://groups.google.com/group/echem-white-paper. Other venues for discussion at conferences and workshops are being planned.

CAS curates strychnine m.p. – ChemInfo Class 9

What is going to distinguish chemistry databases as we move forward in this Web2.0 world?

If I was unsure of it when I started teaching Chemical Information Retrieval 2 months ago, I certainly got my answer yesterday afternoon. Cristian Dumitrescu from CAS contacted me to discuss the problems I had encountered when attempting to use SciFinder to find the melting point of strychnine. He had read my blog post and wanted to make sure he understood the problem. So I had a conference call with him and a CAS colleague and I explained that several m.p. values corresponded to strychnine salts instead of the free base. They agreed to rectify the situation.

Apparently Cristian stays on top of what is being said about CAS products from various sources, including the blogosphere. I think that what will distinguish chemistry databases as we move forward is precisely this type of proactivity and responsiveness.

There are a plethora of databases out there to search for chemical information. Most of them contain surprisingly significant amounts of incorrect data. My students are in the process of demonstrating that with their assignment on finding 5 sources for 5 properties of a chemical of their choice. When they are done in 2 weeks I’ll post about that, perhaps doing a top 10 worst data points.

CAS is an example of a commercial database. But the same principle applies to free databases as well.

Consider the glatiramer acetate problem I reported on previously. ChemSpider immediately removed the entry because a random polymer was being incorrectly represented as a physical mixture of amino acids. As far as I know no other free databases have corrected the problem, although contact information for people running various databases was provided by Michael Kuhn and Egon Willighagen on FriendFeed.

I spoke with Cristian about the problem and he said he would look into it. Upon doing a search for glatiramer acetate on SciFinder it appears that there is currently a problem. The text correctly explains that this is a polymer but the empirical formula looks like just a physical mixture of amino acids, with an extra H2O per unit that should not be there after amide formation. But this was minor compared to the problems I reported on previously – for example there were no incorrectly calculated molecular properties, although the images did not represent the structure of the polymer.

This has been a good week for curation. Yesterday Nick successfully completed the evaluation of the stereochemistry of nargenicin and submitted the corrected SMILES to ChemSpider. Tony Williams has already incorporated the fix and now a search for nargenicin on ChemSpider gives just one entry.

Tony has provided several such puzzles for my students and a few are close to resolving the structures. The main problem is that the structures were entered into ChemSpider with at least one undefined stereocenter. Finding the correct structure from the primary literature can be very challenging for structures of this complexity but it certainly puts the chemical information retrieval methods I am teaching my students to good use.

The class itself was short – and covered mainly just details of student assignments – since we won’t have much time during the last class on December 3, 2009 for a workshop. Rajarshi Guha and Tony Williams will be my guest lecturers on that day.

Cheminfo Retrieval 8th class

This is the lecture from the 8th Chemical Information Retrieval class at Drexel University on November 12, 2009. It starts with a demonstration of how to use ChemSketch and ChemSpider to display and manipulate chemical structures, especially those with complicated stereochemistry. Technical issues with using SMILES between the two platforms are addressed, as are optimization of 3D structures and inverting chiral centers. Microsoft Paint is used to process screen captures to images that can be uploaded to Wikispaces. ChemSpider is also used to generate predicted properties. SDBS is used to retrieve NMR and other spectroscopic data.

Mel Reichman’s Drug Discovery Talk

Mel Reichman gave an outstanding presentation at Drexel on November 12, 2009. I think many of our faculty and students benefited from his unique perspective on high throughput drug discovery and the story of Vioxx from both chemistry and intellectual property considerations.

Unfortunately the resolution of the screen was changed during the presentation because the projector was not working properly. As a result some of the screen capture video got de-centered. I’m embedding the slides as well to see all the details.

Mel Reichman, senior investigator and director of the LIMR Chemical Genomics Center at the Lankenau Institute for Medical Research presents at the chemistry department at Drexel University on November 12, 2009. Introduction by Jean-Claude Bradley.

Modern drug discovery by high-throughput screening (HTS) begins with testing hundreds of thousands of compounds in biological assays. The confirmed hit rate for typical HTS is less than 0.5%; therefore, 99.5% of the costs of HTS are for generating null data. Orthogonal convolution of compound libraries (OCL) is 500% more efficient than present HTS practice. The OCL method combines 10 compounds per well. An advantage of this method is that each compound is represented twice in two separately arrayed pools. The potential for the approach to better enable academic centers of excellence to validate medicinally relevant biological targets is discussed.
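One way to picture pooling in which each compound appears twice in separately arrayed pools of 10 is a simple grid. The abstract does not describe the actual OCL array layout, so the row-and-column arrangement below is purely an assumption for illustration, and all names are hypothetical.

```python
POOL_SIZE = 10  # compounds per well, as stated in the abstract

def build_pools(n_side=POOL_SIZE):
    """Place n_side * n_side compounds on a grid; pool by row and by column."""
    pools = {}
    for r in range(n_side):
        pools[("row", r)] = [r * n_side + c for c in range(n_side)]
    for c in range(n_side):
        pools[("col", c)] = [r * n_side + c for r in range(n_side)]
    return pools

def candidate_hits(pools, active_pools):
    """A compound is a candidate hit when every pool containing it is active."""
    membership = {}
    for pool_id, members in pools.items():
        for compound in members:
            membership.setdefault(compound, []).append(pool_id)
    return sorted(
        compound
        for compound, pool_ids in membership.items()
        if all(pool_id in active_pools for pool_id in pool_ids)
    )

pools = build_pools()
wells_single = POOL_SIZE * POOL_SIZE  # wells needed to test 100 compounds singly
wells_pooled = len(pools)             # wells needed with row and column pools

# A single active compound is identified by its one active row and column.
single_hit = candidate_hits(pools, {("row", 3), ("col", 7)})
```

Under this assumed layout, 100 compounds need 20 wells instead of 100. When two or more compounds in the same grid are active, the intersections become ambiguous and the candidates would need to be retested individually.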

Liz Lyon on Open Science at web-scale

Liz Lyon from UKOLN has just published a JISC report on Open science at web-scale: Optimising participation and predictive potential. This is a very thorough 45 page document that will serve the Open Science community well as a reference for supporting open initiatives. UsefulChem and Open Notebook Science are covered in a balanced way I think.

This report has attempted to draw together and synthesise evidence and opinion associated with data-intensive open science from a wide range of sources. The potential impact of data-intensive open science on research practice and research outcomes, is both substantive and far-reaching. There are implications for funding organisations, for research and information communities and for higher education institutions.

The original specification for the work was highly selective in its choice of areas to study, and this Report addresses only three of these areas in any depth:

* open science including open notebook science : making methodologies, data and results available on the Internet, through transparent working practices
* citizen science including volunteer computing : where volunteers who may not have scientific training, perform or manage research-related tasks such as observation, measurement or computation
* predictive science : data-driven science which enables the forecasting, anticipation or prediction of specific outcomes.

Mel Reichman on Pool Shark’s Cues for More Efficient Drug Discovery

The Drexel Department of Chemistry Seminar Series presents “Pool Shark’s Cues for More Efficient Drug Discovery” on Thursday, November 12, 2009 at 4:30 p.m. in Disque Hall room 109 (32nd Street between Market and Chestnut Streets). Mel Reichman, senior investigator and director of the LIMR Chemical Genomics Center, the Lankenau Institute for Medical Research, is the guest speaker.

Modern drug discovery by high-throughput screening (HTS) begins with testing hundreds of thousands of compounds in biological assays. The confirmed hit rate for typical HTS is less than 0.5%; therefore, 99.5% of the costs of HTS are for generating null data. Orthogonal convolution of compound libraries (OCL) is 500% more efficient than present HTS practice. The OCL method combines 10 compounds per well. An advantage of this method is that each compound is represented twice in two separately arrayed pools. We will discuss results and the potential for the approach to better enable academic centers of excellence to validate medicinally relevant biological targets.

Rhenium – superalloys and industrial catalysts

In this week’s Chemistry in its element podcast, Eric Scerri (UCLA) talks about one of the most useful, but also rarest and densest, elements.

Sing the Periodic Table of Elements

Is there an easy way to memorise all 112 elements? Yes, there is. You could make up a melody, and sing them.

Melody is a great mnemonic device. The idea was used by Carleton University professor Bob Burk, to encourage his students’ interest in Chemistry. He would give extra marks to those students who could memorize all 112 elements, and sing them in front of a 500+ class. Great idea, and very entertaining.

Ed. Of course, one of the most famous ‘element songs’ was composed by Harvard mathematics professor and all-round entertainer Tom Lehrer, first recorded in 1959 to the tune of Gilbert and Sullivan’s Major General’s song. However, as the video below shows, this only includes elements up to and including Nobelium (element 102), although Lehrer does include the catch-all line:

‘These are the only ones of which the news has come to Harvard,
There may be many others but they haven’t been discoooooovered.’

Since then, at least ten new elements have indeed been discoooovered, including the newly ratified Copernicium, Cn, so perhaps we should have an updated song.

From a chemist’s point of view, Lehrer’s tune is very catchy, but doesn’t help with remembering which element goes where. It would be great to have songs in which the elements were sung in order (such as the Russian version above), or in vertical groups – maybe even with a bit of fill-in lyric about the properties of each group.

We’d love to hear from anyone out there who knows (or wants to compose) such a tune – if you post them on YouTube let us know and we can feature them on this thread!

Chemistry World’s weekly round-up of money and molecules

PerkinElmer has proudly congratulated the Brawn GP racing team and driver Jenson Button on winning the Formula One Constructors’ and Drivers’ Championships respectively. The Brawn team used various PerkinElmer instruments to test the car’s performance and reliability.

PerkinElmer’s Optima 5300V inductively coupled plasma (ICP) instrument was used to monitor engine and gearbox wear by detecting metal content in lubricants, and its Spectrum 100 Fourier-transform infrared spectrometer was used to monitor the degradation of seals and analyse organic debris from engine and gearbox lubricants.

PHARMACEUTICAL

The new Merck emerges

Following the completion of its reverse merger with Schering-Plough, the new Merck has emerged from its chrysalis and its chief executive, Richard Clark, announced that it still has a ‘fat wallet and plans more wheeling and dealing’.

The new company currently has 106,000 employees, but is expecting to shed around 15 per cent of those (15,000 jobs) ‘from all areas across the combined company’ to reduce its cost base by $3.5 billion a year.

GSK launches world’s largest malaria vaccine trial

GlaxoSmithKline (GSK) has launched the world’s largest malaria vaccine trial on its RTS,S vaccine (featured in this Chemistry World feature article). The trial has so far enrolled more than 5000 children in seven different sub-Saharan African countries and aims to enroll a further 11000.

‘A malaria vaccine could help save countless lives and redefine the future for Africa’s children,’ said Patricia Njuguna, RTS,S principal investigator and chair of the Clinical Trials Partnership Committee that is leading the clinical development of RTS,S. ‘Communities all across Africa are dedicated to this future and are participating to ensure that we develop a vaccine with an acceptable safety and efficacy profile.’

‘This is a tremendous moment in the fight against malaria and the culmination of more than two decades of research, including 10 years of clinical trials in Africa,’ said Joe Cohen, co-inventor of RTS,S and vice president of R&D, vaccines for emerging diseases and HIV, at GSK Biologicals.

Novartis goes east

Novartis has said it will invest $1 billion (£603 million) over the next five years to build ‘the largest pharmaceutical R&D institute in China’ in response to the country’s increasing demand for healthcare. The company estimates that the move will increase the number of research associates it employs at the Novartis Institute for BioMedical Research in Shanghai (CNIBR) from 160 to over 1000.

The company has also spent $125 million on buying an 85 per cent stake in the Chinese vaccine maker Zhejiang Tianyuan Bio-Pharmaceutical Co. to expand its ‘limited presence in this fast-growing market segment’.

Quintiles and AZ tie the knot

Contract research organisation Quintiles is to ‘assume the operational responsibilities for the majority of AstraZeneca’s (AZ) clinical pharmacology delivery’. According to Anders Ekblom, executive vice president for Global Drug Development at AstraZeneca, ‘this model gives us access to the right scientific and medical expertise plus the quality, flexibility and capacity we need to work efficiently and cost-effectively to deliver these studies.’

Takeda and Amylin ‘fight the fat’

Takeda has agreed to license various obesity drug candidates from Amylin in a deal worth up to $1 billion. The deal includes pramlintide/metreleptin and davalintide, which are currently negotiating their way through Phase II development. Amylin will receive a one-time, up-front payment of $75 million from Takeda as well as various milestone payments.

According to Yasuchika Hasegawa, Takeda’s chief executive, ‘both Amylin and Takeda have extensive experience in the diabetes and metabolic disease area, and this collaboration should allow us to more quickly bring promising new treatments to patients in need.’

ViiV launches

GSK and Pfizer’s HIV joint venture that was announced in April this year has been officially launched. According to Dominique Limet, ViiV’s chief executive, ‘our ambition is to conduct research and development both inside and outside ViiV Healthcare. Our R&D efforts, strategic partnerships and licensing opportunities will be focused on delivering medications that help address resistance issues and dosing complexity. Within our own pipeline we have some very exciting molecules, including our late stage integrase inhibitor development programme.’

INDUSTRY

Ineos considers a bio-refinery future

Ineos Bio has begun a £3.5 million feasibility study into whether its Seal Sands site in the Tees Valley, UK is suitable for a commercial bio-ethanol and bio-energy plant that will use biodegradable household waste as a feedstock. The study is being supported by a £2.2 million grant from the regional development agency One North East and the UK’s Department for Energy and Climate Change.

‘This is a very exciting project. Converting household organic wastes into bio-fuel and clean energy can deliver very attractive environmental and social benefits to the North East and the UK as a whole,’ said Peter Williams, chief executive of Ineos Bio. ‘Essentially, our aim is to provide bio-fuel for cars and bio-energy at competitive cost without harming the environment, with very low or zero net carbon emissions and without competing with food production.’

The technology was featured in greater detail in a Chemistry World feature article published in April.

OSHA fines BP for refinery blast

The US Occupational Safety and Health Administration (OSHA) has slapped BP with an $87.4 million fine for failing to correct potential hazards at its refinery in Texas City, Texas following the fatal explosion that occurred at the site four and a half years ago. 15 people died and 170 were injured following the explosion at the refinery – the third largest in the US.

The fine is the largest in OSHA’s history, with the second largest of $21 million being the fine it imposed on BP following the original incident. The company has already paid more than $2 billion to settle civil lawsuits and paid a $50 million fine to the US Justice Department to settle criminal charges related to the blast.

BP has said it is appealing against this latest fine.

ESG wages war on contamination

Following the acquisitions of Environmental Services Group in 2006 and Scientifics in 2007, testing and inspection group Inspicio has combined the companies and launched the Environmental Scientifics Group (ESG). The new group will provide testing, analysis and consultancy services across a range of fields from forensics and commodity chemical analysis to environmental monitoring.

At the launch in London’s Cabinet War Rooms, David Watson, managing director for the newly created Laboratories and Analytical Services division, told Chemistry World that the new structure brings together all of the companies’ services under a single management structure.

Shell slashing jobs as profits plummet

Shell is cutting 5000 jobs, around 10 per cent of its workforce, as part of its previously announced plan to streamline the business, which saw third quarter earnings slump 73 per cent to $2.9 billion compared to the same period last year.

Shell’s chief executive, Peter Voser, said the company’s results ‘were affected by the weak global economy. Upstream and Downstream profitability has been sharply reduced compared to year-ago levels.’

The company did not say how many job cuts would be made from its chemicals business, which saw chemical sales volumes fall 5 per cent compared to the same period last year.

‘We continue to focus on improving our competitive cost position, simplifying Shell, and increasing personal accountabilities. The Transition 2009 programme, which I announced earlier this year, is progressing well, and will be completed by the end of 2009. Some 5,000 employees are leaving Shell as a result of these changes. This represents around a 10% reduction in employees in the redesigned divisions and corporate functions,’ said Voser.

‘We have reduced operating costs by some $1 billion in the first nine months of 2009 compared to the same period in 2008. This reduction excludes the impact of exchange rate movements and non-cash pension costs.’

Rhodia on the up as others still in the slump

Rhodia has seen its third quarter operating profit increase 19.5 per cent year-on-year to €104 million (£93 million) despite sales falling 15 per cent to €1.04 billion. The increase in profitability was due to the company’s cost saving drive, which has reduced fixed expenditures by €96 million so far this year.

‘In Q3, our results continued to improve substantially, especially in our Polyamide and Silcea activities. This was due not only to a significant recovery in demand driven by emerging markets, but also to our ability to defend margins and our enhanced operational efficiency,’ said Rhodia’s chief executive, Jean-Pierre Clamadieu.

‘We anticipate that demand in Q4 will remain similar to the Q3 level. I am convinced that we are today well prepared to emerge stronger from the crisis.’

But the news across the sector is not all so rosy – DSM’s third quarter operating profits fell 41 per cent to €139 million with sales falling to €2.02 billion, 14 per cent down on the prior year’s result. However, despite the gloom its operating profits were more than double those achieved during the second quarter of this year.

Total’s chemicals business also saw a fall in revenues and profits in the third quarter, with sales dropping 28 per cent year-on-year to €3.89 billion and operating profits falling 44 per cent to €191 million. However, both these figures were an improvement on the company’s second quarter results with sales increasing 6 per cent and operating profits more than tripling.

Matt Wilkinson and Phillip Broadwith

Sound bites from the UK nanotech community

Earlier this week I attended the Nano and emerging technologies forum in London, a networking conference where the UK’s nanotechnology community got together to discuss the state of the art in this field.

Tony Ryan from the University of Sheffield opened the meeting with the statements: ‘the UK is the powerhouse of nanotechnology’ and ‘nanotechnology and the UK are in a good position to tackle the global grand challenges.’ And this was the message repeated by many others throughout the event.

And the cash is certainly there to support these scientists, with the UK’s research councils ploughing £50 million per year into the area and the Technology Strategy Board and various centres pledging to add a further £170 million to the pot over the next five years.

I thought the best way to give you a taste of the wide range of things I learnt this week was through sound bites from the best talks:

‘Nanobots would realistically need to look more like sperm, not submarines as they are often portrayed in the media,’ according to Ryan. This is because they would need tails to allow them to ‘swim’ through our blood vessels. The UK community is currently trying to develop a viable medical robot funded by the EPSRC (Engineering and Physical Sciences Research Council) grand challenge in healthcare.

‘200 different chemicals are exhaled in our breath,’ Victor Higgs, director of Applied Nanodetectors. These can be used to monitor and diagnose diseases such as asthma and diabetes. His company is currently developing a mobile phone with a sensor that can detect nitric oxide levels in the breath of asthmatics. More on this device will be available on our homepage tomorrow.

‘We should develop nanotechnology that fulfils the needs of tomorrow, not the day after tomorrow,’ Christos Tokamanis from the European Commission. He says that thinking too far ahead will leave us with technology gaps that we can’t fill.

‘Biotech firms are hitting the buffers,’ Jonathan Heppe, a partner at venture capital firm Seroba-Kernel Life Sciences. He was talking about the current financial crisis meaning that venture capitalists are struggling to raise new funds, so most are focusing on their existing portfolios and not investing in new projects. He did however provide some data to suggest we might be turning a corner.

‘We can’t just shove nanoparticles into existing systems, we have to think very differently,’ Bruce Jefferson, Cranfield University. He was talking about the TiO2 nanoparticle-based system he is developing for waste water treatment. Price is one of the major hurdles he is struggling to overcome: it needs to cost £0.10 per cubic metre of water treated and is currently at a whopping £100 per cubic metre!

‘Current systems are not competitive in mass markets due to high costs and unproven reliability,’ Robin Francis from the Carbon Trust on polymer fuel cells. The Carbon Trust is currently inviting grant applications for the £6 million it will be investing in this area over the next five years.

‘Carbon dioxide could be turned back into fuel,’ Peter Edwards, University of Oxford. His suggestion is that CO2 could be used as a carbon source to create new hydrocarbon fuels, as an alternative to carbon capture and storage. This process is called tri-forming and will involve turning a mixture of CO2, NO2 and methane into petrochemicals. His team are currently investigating different catalysts to do this, including nano-Co/Al2O3.

Nina Notman

Indium – LCD televisions, computer monitors and solar cells

In this week’s Chemistry in its element, Claire Carmalt from University College London talks about the metal that cries when it is heated and is vital for modern day living.