09 Jun 2023
Can ChatGPT replace the Analytical Chemist?
The use of artificial intelligence is a hot topic in a variety of workplaces, and analytical chemistry is no exception. In fact, the advantages of computing power have led to improved workflows in many areas, such as data analysis, statistical analysis, decision making and modelling, because of the enormous amounts of data that studies can output. The -omics fields have exploded thanks in large part to advances in computing in the 21st century. Humans can be subject to bias and variability in their work, but AI can miss contextualized evidence and lack creativity (or thinking outside the box). So is the role of the expert Analytical Chemist at risk?
We put several of the most common questions asked of search engines and ask-the-expert forums to ChatGPT to see how an AI response differs from that of an expert here at Element.
What is the difference between capillary column and packed column?
Capillary columns and packed columns are two types of chromatographic columns used in gas chromatography. The main differences between the two are set out below.
ChatGPT: Capillary columns are narrow, fused-silica tubes with an internal diameter of less than 0.5 mm, while packed columns are wider tubes (typically 2-4 mm internal diameter) filled with a packing material, such as silica gel or molecular sieves
Our engaged expert: Capillary columns can in fact go above 0.5 mm i.d. These are commonly known as megabore columns, and they allow larger amounts of sample to be loaded.
ChatGPT: Capillary columns offer higher separation efficiency due to their smaller internal diameter, which results in a larger surface area for interaction between the sample and the stationary phase. Packed columns have a lower separation efficiency due to their larger particle size and less surface area for interaction.
Our engaged expert: We are not comparing like with like. In capillary columns the liquid phase is bonded to the inner wall of the column; in packed columns the liquid phase is bonded to the particles.
In addition, the efficiency of narrower-ID capillary columns isn’t a feature of surface area; it’s about how many interactions the analyte has with the stationary phase chemistry. Think of bouncing a tennis ball along a drainpipe versus bouncing it through a much larger pipe (e.g. the water pipe for a hydroelectric power station): the ball hits the sides much more often in the narrower tube, allowing the stationary phase many more opportunities to work its magic.
ChatGPT: Packed columns have a larger sample capacity compared to capillary columns due to their wider internal diameter and the larger volume of the packing material. This makes packed columns better suited for the analysis of complex mixtures.
Our engaged expert: Packed columns do in fact have a higher sample capacity, but sample capacity and efficiency are two different things. Complex mixtures need higher efficiency – which you get from capillary columns.
ChatGPT: Capillary columns require a lower flow rate of the mobile phase (i.e., carrier gas) compared to packed columns due to their smaller internal diameter. This makes capillary columns better suited for the analysis of small sample volumes.
Our engaged expert: It is true that capillary columns work better with small volumes, but that has nothing to do with flow rate. It is defined by the column capacity, which is primarily related to ID and film thickness.
ChatGPT: Capillary columns offer higher sensitivity compared to packed columns due to their smaller internal diameter and higher separation efficiency, which results in better separation of analytes and less interference from other compounds.
Our engaged expert: Sharpening a peak means the same peak area is contained in a much narrower peak width, so the peak is taller, which leads to an improved signal-to-noise (S/N) ratio.
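The expert's point about peak width and S/N can be illustrated with a minimal sketch. It assumes an ideal Gaussian peak shape and a baseline noise level that is the same for both columns; the area, widths, and noise values are hypothetical numbers chosen only for illustration, not measurements.

```python
import math

def gaussian_peak_height(area, sigma):
    """Apex height of an ideal Gaussian peak with the given area and standard deviation."""
    return area / (sigma * math.sqrt(2 * math.pi))

# Hypothetical values, chosen only for illustration.
area = 100.0   # peak area (signal units x seconds), identical for both peaks
noise = 0.5    # baseline noise (signal units), assumed equal for both columns

broad = gaussian_peak_height(area, sigma=6.0)   # wider peak
sharp = gaussian_peak_height(area, sigma=1.5)   # narrower peak, same area

print(f"broad peak: height {broad:.2f}, S/N {broad / noise:.1f}")
print(f"sharp peak: height {sharp:.2f}, S/N {sharp / noise:.1f}")
print(f"S/N gain from sharpening: {sharp / broad:.1f}x")
```

Under these assumptions, making the peak four times narrower makes it four times taller, so the S/N ratio improves by the same factor even though the area is unchanged.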
ChatGPT: Capillary columns offer higher separation efficiency, lower sample capacity, and higher sensitivity, while packed columns offer lower separation efficiency, higher sample capacity, and lower sensitivity. The choice between the two depends on the nature of the sample and the desired analytical outcome.
Our engaged expert: It is interesting that it says that packed columns are better for complex mixtures, which I would disagree with.
Also, the sensitivity point would be better expanded on by saying that the gain comes from peak shape.
I have peaks appearing in my HPLC (High Performance Liquid Chromatography) blank at the same retention time as my previous injection, why?
If peaks are appearing in your HPLC blank at the same retention time as your previous injection, it is likely that the contamination is originating from the sample or from the instrument.
One possibility is that there is carryover from the previous injection, where some residual sample may have been left in the column or in the injection system. This can result in ghost peaks or carryover peaks that appear in the blank at the same retention time as the previous injection. To avoid carryover, it is important to flush the column and the injection system with an appropriate wash solvent between injections, as well as to properly clean and maintain the instrument.
Another possibility is that the blank or the solvent used to prepare the blank may be contaminated. It is important to use high-quality solvents and blanks, and to properly prepare and store them to avoid contamination.
Additionally, some instrument-related factors, such as column bleed or detector noise, may also result in spurious peaks in the blank. These factors can be minimized by optimizing the instrument conditions and ensuring that the instrument is properly maintained and calibrated.
To troubleshoot this issue, it is recommended to perform a series of blank injections to identify any potential sources of contamination. If peaks continue to appear in the blank, it may be necessary to investigate the instrument and sample preparation procedures to identify the source of the contamination and take appropriate measures to eliminate it.
Our engaged expert
Troubleshooting a problem is often a lot more difficult than fixing it, and it is often beneficial not just to fix it but to identify its cause to prevent it from happening again. In the long term this saves money on consumables, re-analysis, and delays.
I would approach this logically, ruling out possibilities one at a time to narrow down the probable causes and performing the simplest tasks first.
First I would determine whether this is actually an issue or one of those weird system quirks that instruments like to throw at you at 4:58pm. Run the mobile phase through the system for an extended period of time and see if the peak(s) are still present; if they are, check whether they are consistent in area, shape, and retention time. Every minor detail adds to the solution.
If the peaks are still present, start to rule out parts of the HPLC. For example, running a system blank without an injection can determine if the contamination is coming from the vial (both sample and diluent) or the instrument. This is a quick, simple, and inexpensive test that yields a lot of information.
Keep bypassing parts of the system until the peaks are no longer observed. If the peaks are present when the autosampler is connected, but absent when it is not, there is compelling evidence that the autosampler is the source and then it can be narrowed down even further to the specific component.
The question itself supplies useful information. Peaks with consistent retention times are commonly referred to as ghost peaks, and their sources are different from those of carryover peaks, which tend to have variable retention times and suffer from band broadening. This distinction can rule out sources without the need to carry out expensive and/or time-consuming diagnoses.
How do I choose the appropriate weighting factor for my calibration?
Choosing the appropriate weighting factor depends on the nature of your data and the goals of your analysis. There are several weighting factors commonly used in weighted regression, including:
1/y: This weighting factor gives more weight to data points with smaller y values, which can be useful when the uncertainty in the dependent variable is proportional to its magnitude.
1/x: This weighting factor gives more weight to data points with larger x values, which can be useful when the uncertainty in the independent variable is proportional to its magnitude.
1/x^2: This weighting factor gives more weight to data points with larger x values and is useful when the error in the independent variable is proportional to the square of its magnitude.
1/y^2: This weighting factor gives more weight to data points with smaller y values and is useful when the error in the dependent variable is proportional to the square of its magnitude.
Other custom weighting factors: You can also define your own weighting factor based on your understanding of the data and the specific problem you are trying to solve.
In general, the appropriate weighting factor depends on the nature of the data and the problem you are trying to solve. For example, if you have data with a high degree of scatter, you may want to use a weighting factor that gives more weight to data points with smaller errors, such as 1/y^2. If you have data with high uncertainty in the independent variable, you may want to use a weighting factor that gives more weight to data points with larger independent variable values, such as 1/x. The choice of the weighting factor should be guided by your understanding of the data and the goals of your analysis.
Our engaged expert
I thought the question was “How” do I choose, not “What” are the choices. In that sense, I think the AI has failed to answer.
It also has failed to list all the possible weighting options. The “original” weighting factors were in fact 1/s², the variance on the response.
Although the 1/x and 1/x² options are available in pretty much every piece of chromatography software, they should not actually be there, because they should not apply to calibration curves!
1/x is not applicable to a calibration curve, where there is no uncertainty in the ‘x’ (that is one of the fundamental assumptions in simple regression).
OK, you may say that in a calibration we back-calculate the concentration of the standards and get an error in ‘x’. That is correct, but that error does not reflect the accuracy of the preparation of those standards, simply the lack of fit of the model, which is caused by errors in ‘y’. So, the AI failed to point that out.
Finally, it does not answer the “How” question.
The answer should have the word “heteroscedasticity” somewhere. That is a serious omission by the AI. Homoscedasticity means that the error in ‘y’ is independent of the magnitude of ‘x’, and that is another fundamental assumption in simple regression. If this condition is not met (that is, the data are heteroscedastic), then simple regression is not applicable, and weightings should be used.
There are many tests to determine heteroscedasticity, one of which is Levene’s test, but going into that is a whole extra topic!
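As a rough illustration of the expert's points, the sketch below estimates the variance of replicate responses at each calibration level and then uses 1/s² as the weighting factor in a weighted least-squares fit. The replicate data are made-up numbers and the function name `weighted_linear_fit` is hypothetical. It is a sketch under those assumptions, not a validated procedure, and it does not perform a formal test such as Levene’s test.

```python
from statistics import mean, variance

# Hypothetical replicate responses at each standard concentration (made-up numbers).
replicates = {
    1.0:   [10.2, 9.8, 10.1],
    10.0:  [101.0, 98.0, 102.5],
    100.0: [1010.0, 975.0, 1030.0],
}

def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = slope * x + intercept."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

conc = list(replicates)
resp = [mean(r) for r in replicates.values()]
var = [variance(r) for r in replicates.values()]   # sample variance s^2 per level

# A response variance that grows with concentration suggests heteroscedasticity.
for c, s2 in zip(conc, var):
    print(f"conc {c:>6}: response variance s^2 = {s2:.3f}")

weights = [1.0 / s2 for s2 in var]                  # the 1/s^2 weighting factor
unweighted = weighted_linear_fit(conc, resp, [1.0] * len(conc))
weighted = weighted_linear_fit(conc, resp, weights)
print("unweighted slope, intercept:   ", unweighted)
print("1/s^2-weighted slope, intercept:", weighted)
```

With equal weights the function reduces to ordinary simple regression, so the two fits can be compared directly on the same data.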
How much CaCO3 do I add to 100mL to make a 10ppm Ca solution for ICP (Inductively Coupled Plasma) analysis?
To prepare a 10 ppm Ca solution for ICP (Inductively Coupled Plasma) analysis using CaCO3 as a source of Ca, you need to calculate the amount of CaCO3 required based on its molecular weight and the desired concentration.
The atomic weight of Ca is 40.08 g/mol.
To prepare a 10 ppm Ca solution, you need to add 10 mg of Ca to 1 litre (1000 mL) of water. For 100 mL of water, you need to use a proportionate amount of Ca.
The calculation can be done as follows:
10 ppm = 10 mg/L = 10 mg/1000 mL = 0.01 mg/mL
0.01 mg/mL x 100 mL = 1 mg
1 mg Ca x (1 mole Ca / 40.08 g Ca) x (1 mole CaCO3 / 1 mole Ca) x (100.09 g CaCO3 / 1 mole CaCO3) = 0.02499 g CaCO3
Therefore, you need to add approximately 0.02499 grams of CaCO3 to 100 mL of water to prepare a 10 ppm Ca solution for ICP analysis.
Our engaged expert
This is where AI can shine, as it is simple mathematics that humans can easily get wrong, and converting between units is a common cause of lab errors.
I would also do the first part the same way: convert from ppm to the unit that is suitable and required, in this case mg/mL.
1 ppm = 1 mg/L
10 ppm = 10 mg/L
10 ppm = 1 mg/100 mL
100.09 g/mol CaCO3 / 40.08 g/mol Ca = 2.497
2.497 * 1 mg/100 mL = 2.497 mg CaCO3 in 100 mL
If this were a demonstration calculation, I would mention that average atomic weights are being used, to avoid future confusion.
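The expert's calculation above can be sketched in a few lines. This assumes, as the text does, that 1 ppm = 1 mg/L for a dilute aqueous solution, and it uses the molar masses quoted in the text; the function name `mass_caco3_mg` is hypothetical.

```python
M_CA = 40.08       # g/mol, as quoted in the text
M_CACO3 = 100.09   # g/mol, as quoted in the text

def mass_caco3_mg(target_ppm, volume_ml):
    """Mass of CaCO3 (mg) supplying the Ca for target_ppm (taken as mg/L) in volume_ml."""
    mass_ca_mg = target_ppm * volume_ml / 1000.0   # mg/L x L = mg of Ca needed
    return mass_ca_mg * M_CACO3 / M_CA             # scale by the CaCO3/Ca molar mass ratio

print(f"{mass_caco3_mg(10, 100):.3f} mg CaCO3")
```

Keeping every mass in milligrams until the final step avoids a unit conversion between mg and g, which is where this kind of calculation most easily goes wrong.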
Artificial intelligence and humans both have their own unique strengths and weaknesses when it comes to answering analytical chemistry questions.
AI excels at tasks that involve data processing and analysis, such as pattern recognition and prediction. It can analyse vast amounts of data quickly and accurately, and can detect subtle patterns that may be missed by human analysts.
However, AI still has limitations, such as difficulty in dealing with complex and unexpected situations that require creativity and adaptability. Humans are better at tasks that require critical thinking, problem-solving, and decision-making skills.
This can be seen in the answers to the questions we put to it: simple calculations gave correct answers, but anything that approaches opinion led to either wrong or over-reaching conclusions.
Going forward it might be best to combine the strengths of AI and human expertise in analytical chemistry, with the human still required to “sense-check” the output. So, we experts still have a place for now.