Workshop on Data Mining and Data Fusion
Vienna International Centre
Conference Room II, Building C07, Vienna, Austria
15-16 September 2008
• Data Fusion and Data Mining Experts
• Topic Coordinators and Integrators of CTBT-ISS Project
• PTS staff members
The Preparatory Commission for the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO) is arranging a two-day workshop on data fusion and data mining. The purpose is to address how data fusion can be applied to observations from different monitoring networks and to study the possible use of data mining to improve the analysis of verification data. The meeting will be held on 15-16 September 2008 at the Vienna International Centre in Vienna, Austria. The tentative agenda of the meeting is annexed.
“Data fusion” applied to CTBT verification data
Data fusion addresses how data from different technologies can be interpreted jointly. Observations made by different monitoring technologies can support each other in the analysis of a particular event. The absence of detections can also play an essential role when analyzing an event.
The following data and information fusion challenges are of prime interest to the Provisional Technical Secretariat (PTS):
- Data fusion between Seismic and Infrasound detections. A large atmospheric explosion may cause shockwaves that are also detectable by seismic stations. Conversely, underground events such as large explosions may sometimes cause infrasound signals. Volcanic eruptions are also sources of both seismic and infrasound signals.
- Data fusion between Seismic and Hydroacoustic observations; it is well known that hydroacoustic stations can record signals from seismic events below or close to the ocean. The seismic and hydroacoustic data can then be used jointly to locate such events and to facilitate their interpretation.
- The sources of radionuclide observations have to be determined through atmospheric transport modeling (ATM). Taking into account the actual movements of the atmosphere, it is possible to “backtrack” the path the radionuclide material has taken. This is an important fusion of radionuclide observations and data and information on the atmosphere.
- Source locations estimated from “backtracking” of radionuclide observations through atmospheric modeling contain very large uncertainties. Using source locations estimated from observations by any of the three waveform technologies as potential sources of radionuclide release, it is possible to explain the radionuclide observations with a much higher degree of confidence. If the location of the waveform event is taken as a release point, ATM “forward modeling” may provide an estimate of where radionuclide detections could potentially be seen.
- States may use information collected by national technical means in combination with data and information from the CTBT verification network.
- Data Fusion and On-Site Inspection (OSI); during an OSI a large number of technical measurements might be conducted, and these data and information have to be fused with human observations made by the inspectors in the same limited area. This fusion has to be carried out under severe time pressure and serves as the basis for deciding how to proceed with the inspection, which makes it a great challenge.
The workshop should address current practices on data fusion in the scientific community and how they can be applied to enhance the data analysis at the PTS.
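As an illustration of the joint location idea mentioned above for seismic and hydroacoustic data, the following is a minimal sketch of one common fusion technique: combining two location estimates by inverse-covariance weighting. It assumes that the two estimates are independent and have Gaussian errors; the function name, the local east/north frame and all numbers are hypothetical, and the sketch is not a description of the procedure in use at the PTS.

```python
import numpy as np

def fuse_estimates(x1, cov1, x2, cov2):
    """Fuse two independent Gaussian estimates of the same location.

    Each estimate is weighted by its inverse covariance (its "information"),
    so the more certain estimate dominates in each direction.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    info1 = np.linalg.inv(cov1)
    info2 = np.linalg.inv(cov2)
    fused_cov = np.linalg.inv(info1 + info2)
    fused_x = fused_cov @ (info1 @ x1 + info2 @ x2)
    return fused_x, fused_cov

# Hypothetical estimates in a local east/north frame (km); variances in km^2.
seismic_xy = [10.0, 0.0]
seismic_cov = np.diag([100.0, 25.0])   # poorly constrained east-west
hydro_xy = [0.0, 4.0]
hydro_cov = np.diag([25.0, 100.0])     # poorly constrained north-south

fused_xy, fused_cov = fuse_estimates(seismic_xy, seismic_cov, hydro_xy, hydro_cov)
print(fused_xy)
print(fused_cov)
```

With these illustrative numbers the fused position lies near (2.0, 0.8) km and the fused variance in each direction is smaller than that of either input, which is the benefit joint interpretation is expected to bring when the independence assumption holds.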
“Data mining” and analysis of CTBT verification data
Although the existing procedures for data analysis at the International Data Centre (IDC) have proven capable of meeting present needs, the PTS must constantly keep abreast of scientific developments and gradually improve its methods and procedures. There is growing pressure to establish more efficient automatic analysis procedures that will provide improved results and thus reduce the workload of interactive analysis by human analysts. This is essential to cope with a growing flow of data, as more and more stations are installed, without sacrificing the quality of the products.
Many of these issues relate to the dramatic developments in information analysis taking place at many scientific institutions around the globe. The purpose of this workshop is to explore how these new scientific developments may improve analysis procedures and methods at the PTS. The aim is not only to achieve a more cost-effective analysis process, but also to improve the quality of the products delivered by the CTBTO.
“Data mining” applicable at different levels
“Data mining” methods and procedures might be applied to the analysis of CTBT verification data in several ways and at different levels in the IMS/IDC data processing:
At station level; to enhance the detection of new signals using previously detected signals as a reference. Seismic signals from the same source region tend to have similar characteristics when observed at a particular station. Recordings of the same event at different stations may, on the other hand, differ substantially because of differences in wave propagation. Hydroacoustic signals show similar characteristics, whereas infrasound signals vary considerably over time even for a fixed source-receiver pair. At some stations this reference event approach has been successfully tested, using cross-correlation techniques, on seismic signals from repeating sources such as mines.
IDC Routine Analysis; a key element in the routine data analysis procedure is to identify which of the detected signals originate from the same event. This association process is integrated into an iterative location process. Comparing the characteristics of the detected signals, such as their waveforms, with those in the large database of signals from previously observed and located events is likely to improve the speed and accuracy of the association process.
Description of the events - “Screening”; to facilitate the interpretation of observed events, the PTS provides States with a number of parameters characterizing the observations. Further development of the way an observed event is characterized would be most helpful to States in their assessment of these events. Characterizing a new event in relation to earlier nearby events, using data available in the PTS database, might improve this characterization.
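The reference event approach described at station level can be sketched as a sliding normalized cross-correlation between a continuous trace and a stored template waveform. This is a minimal illustration under stated assumptions, not the detector used at any station: the function names, the 0.7 threshold and the synthetic data are all hypothetical, and a practical implementation would typically add filtering and a faster correlation method.

```python
import numpy as np

def normalized_cross_correlation(trace, template):
    """Correlation coefficient between the template and each trace window."""
    trace = np.asarray(trace, dtype=float)
    template = np.asarray(template, dtype=float)
    n = template.size
    t = template - template.mean()
    t_norm = np.sqrt(np.sum(t ** 2))
    cc = np.zeros(trace.size - n + 1)
    for i in range(cc.size):
        w = trace[i:i + n]
        w = w - w.mean()
        denom = t_norm * np.sqrt(np.sum(w ** 2))
        if denom > 0:
            cc[i] = np.dot(w, t) / denom
    return cc

def detect(trace, template, threshold=0.7):
    """Return the sample offsets where the correlation reaches the threshold."""
    cc = normalized_cross_correlation(trace, template)
    return np.flatnonzero(cc >= threshold), cc

# Synthetic example: a decaying sinusoid stands in for a reference signal
# from a repeating source, buried in low-level random noise.
rng = np.random.default_rng(0)
time = np.linspace(0.0, 1.0, 50)
template = np.sin(2 * np.pi * 5 * time) * np.exp(-3 * time)
trace = 0.05 * rng.standard_normal(500)
trace[200:250] += template            # hidden repeat of the reference event

detections, cc = detect(trace, template)
best_offset = int(np.argmax(cc))
print(best_offset, cc[best_offset])
```

Because the coefficient is normalized, the threshold does not depend on signal amplitude, so a weaker repeat of the same source can still be matched. Offsets adjacent to the true onset may also exceed the threshold, so in practice the detections would be grouped and the local maximum kept.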
An Experimental Environment
To study how these new scientific developments might be applied to the analysis of CTBT verification data, an experimental environment might be useful. Such an environment should be open to the scientific community. Scientists could bring their methods and procedures to this experimental system and test them on realistic data. The environment might also serve as a meeting place for data mining experts, promoting cooperation among them and facilitating the further development of procedures and methods. The concept of such an environment will be discussed in the workshop.
States Signatories willing to nominate experts and to contribute to the achievement of the above objectives of the workshop are requested to inform the Evaluation Section of the PTS at the address below. These communications should include the names of the experts, their current positions, areas of expertise and topic(s) of presentation, and should be forwarded to the PTS before 25 July 2008.
Please return this form before 17 August 2008 to the PTS contact below.
9:30 – 10:00 Welcome remarks and introduction
10:00 – 10:45 Invited presentation on data fusion
11:00 – 11:30 Presentation on current data fusion between the waveforms at the PTS
11:30 – 12:00 Presentation on current data fusion between the radionuclide detections and waveforms at the PTS
12:00 – 12:30 Presentations of other data fusion experiences
14:30 – 16:30 Discussion on data fusion principles from a treaty verification point of view
16:45 – 17:45 Discussion (cont.)
Conclusions of Data fusion day
9:30 – 10:00 Introduction to Data Mining
Invited presentation on data mining
10:00 – 10:45 Possible perspectives on data mining for CTBTO
11:00 – 12:30 Invited presentations
Waveform phase detection and identification
14:30 – 16:30 Invited presentations - Event categorization
16:45 – 17:45 Experimental data mining environment
Conclusions of Data Mining day
Participants must make their own hotel reservations directly with the hotel of their choice.
Participants are responsible for obtaining their own visas for Austria. If a visa is needed, they should apply for it well in advance of their departure. Visas cannot be issued upon arrival.
Health and Accident Insurance:
Prior to their departure, participants should obtain health and accident insurance covering the duration of the workshop.
All costs related to attendance at the workshop (travel costs, daily subsistence allowance, health insurance, etc.) shall be borne by the participants.
The PTS will explore opportunities for providing limited financial assistance to some experts from low-income and lower-middle-income countries, as classified in the World Development Indicators database of the World Bank, subject to the availability of funds.
The Organizing Committee may limit the number of presentations to allow a more active exchange of views among participants and to contribute to the achievement of the objectives of the workshop. Relevant administrative information will be forwarded to registered participants in due course.
Point of Contact:
Ms Jennifer YLO
Software Applications Section
International Data Centre
CTBTO, Vienna International Centre
PO Box 1200, A-1400, Vienna, Austria
Fax: +43 1 26030 5874
Tel.: +43 1 26030 6135