nb4. Query by scientific category
The relevant columns in the ALMA TAP service are:
The scientific categories of observations are:
Note: these categories are stable in time and valid for all ALMA Cycles. They may therefore differ slightly from the categories in the ALMA Observing Tool, which can change from Cycle to Cycle.
import numpy as np
from astropy.table import Table
import pyvo
import sys
import matplotlib.pyplot as plt
import pandas as pd
import sklearn.cluster
service = pyvo.dal.TAPService("https://almascience.eso.org/tap") # for the EU ALMA TAP service
# service = pyvo.dal.TAPService("https://almascience.nao.ac.jp/tap") # for the EA ALMA TAP service
# service = pyvo.dal.TAPService("https://almascience.nrao.edu/tap") # for the NA ALMA TAP service
def query_scientific_category(service, scientific_category):
    """Query for all science observations of a given scientific category.

    To reduce the memory requirements, rather than using *, it is often useful
    to select the columns of interest.

    service               pyvo TAPService instance
    scientific_category   one of the categories shown at the top of this notebook

    returns               pandas table
    """
    query = f"""
            SELECT target_name, science_keyword, s_ra, s_dec, band_list
            FROM ivoa.obscore
            WHERE scientific_category = '{scientific_category}'
            AND science_observation = 'T'
            GROUP BY target_name, science_keyword, s_ra, s_dec, band_list
            """
    return service.search(query).to_table().to_pandas()
def query_science_keyword(service, science_keyword):
    """Query for all science observations matching a science keyword (or part of it).

    ALMA has a long list of science keywords in the Observing Tool, from which
    PIs need to select one or two in their proposals.

    service           pyvo TAPService instance
    science_keyword   one of the science keywords of ALMA or a substring

    returns           pandas table
    """
    query = f"""
            SELECT s_ra, s_dec, target_name, band_list, t_exptime
            FROM ivoa.obscore
            WHERE science_observation = 'T'
            AND science_keyword like '%{science_keyword}%'
            GROUP BY s_ra, s_dec, target_name, band_list, t_exptime
            """
    return service.search(query).to_table().to_pandas()
def query_science_keyword_datatype(service, science_keyword, datatype):
    """Query showing how to combine several constraints.

    Here the science keyword, the data product type and the science_observation
    flag are all constrained at the same time.

    service           pyvo TAPService instance
    science_keyword   one of the science keywords of ALMA or a substring
    datatype          "image" or "cube"

    returns           pandas table
    """
    query = f"""
            SELECT s_ra, s_dec, target_name, band_list, t_exptime
            FROM ivoa.obscore
            WHERE science_keyword like '%{science_keyword}%'
            AND science_observation = 'T'
            AND dataproduct_type = '{datatype}'
            GROUP BY s_ra, s_dec, target_name, band_list, t_exptime
            """
    return service.search(query).to_table().to_pandas()
These queries only return science observations, not calibrator observations.
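The functions above paste the search string straight into the ADQL text. Several ALMA science keywords contain a single quote (for example "Sunyaev-Zel'dovich"), which would end the string literal early. A minimal sketch of one way to guard against this, assuming the usual SQL/ADQL convention that a single quote inside a literal is written as two single quotes; `adql_quote` and `build_keyword_query` are hypothetical helper names, not part of pyvo:

```python
def adql_quote(value: str) -> str:
    """Return value as an ADQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_keyword_query(science_keyword: str) -> str:
    """Build the science-keyword query text with a safely quoted LIKE pattern."""
    pattern = adql_quote(f"%{science_keyword}%")
    return (
        "SELECT s_ra, s_dec, target_name, band_list, t_exptime\n"
        "FROM ivoa.obscore\n"
        "WHERE science_observation = 'T'\n"
        f"AND science_keyword LIKE {pattern}\n"
        "GROUP BY s_ra, s_dec, target_name, band_list, t_exptime"
    )


# The query text can be inspected before it is sent to the service
print(build_keyword_query("Sunyaev-Zel'dovich"))
```

The resulting string can be passed to `service.search(...)` in the same way as the queries above.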
output_agn = query_scientific_category(service, 'Active galaxies')
print(f"There are {len(np.unique(output_agn['target_name']))} unique target names within the list of {len(output_agn)} results.")
There are 9865 unique target names within the list of 12844 results.
Showing the first 30 unique target names:
np.set_printoptions(threshold=sys.maxsize)
np.array(np.unique(output_agn['target_name']))[0:30]
array(['0-10000', '0-10510', '0-12043', '0-12407', '0-13299', '0-13375', '0-1426', '0-1437', '0-16822', '0-17244', '0-17749', '0-18038', '0-18180', '0-19883', '0-22825', '0-2318', '0-23382', '0-23626', '0-24625', '0-24636', '0-26339', '0-34302', '0-34622', '0-34897', '0-3662', '0-3753', '0-3973', '0-4356', '0-4503', '0-4936'], dtype=object)
In which bands have the AGN (as determined by the science category) been observed?
plt.rcParams["figure.figsize"] = (10,5)
output_agn['band_list'].hist(bins = 16)
Where have they been observed?
plt.rcParams["figure.figsize"] = (20,15)
output_agn.plot(x='s_ra',y='s_dec', linestyle='', ms=7, marker='o', alpha=0.03, label='AGN (science category) observed with ALMA')
plt.xlabel('RA')
plt.ylabel('Dec')
Within this science category there are several combinations of science keywords. We split each combination of science keywords up and then plot the pie chart.
flattened_science_keywords = [item.strip() for sublist in [keyword.split(',') for keyword in output_agn['science_keyword']] for item in sublist]
pd.DataFrame({'science_keyword':flattened_science_keywords})['science_keyword'].value_counts().plot.pie()
pd.DataFrame({'science_keyword':flattened_science_keywords})['science_keyword'].value_counts()
Starburst galaxies                                                   7643
Sub-mm Galaxies (SMG)                                                5877
Active Galactic Nuclei (AGN)/Quasars (QSO)                           1993
Starbursts                                                           1707
star formation                                                       1707
Galaxy structure & evolution                                         1368
High-z Active Galactic Nuclei (AGN)                                   993
jets                                                                  961
Outflows                                                              961
feedback                                                              961
Galactic centres/nuclei                                               852
Surveys of galaxies                                                   349
Luminous and Ultra-Luminous Infra-Red Galaxies (LIRG & ULIRG)         321
Giant Molecular Clouds (GMC) properties                               294
Merging and interacting galaxies                                      293
Galaxy groups and clusters                                            199
Spiral galaxies                                                       169
Gravitational lenses                                                  112
Galaxy Clusters                                                        87
Galaxy chemistry                                                       69
Dwarf/metal-poor galaxies                                              47
Early-type galaxies                                                    46
Gamma Ray Bursts (GRB)                                                 20
Damped Lyman Alpha (DLA) systems                                        4
Cosmic Microwave Background (CMB)/Sunyaev-Zel'dovich Effect (SZE)       2
Name: science_keyword, dtype: int64
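The split-and-count step can also be written with the standard library only, which makes the logic easier to follow. This is a sketch on a small hypothetical sample, not the archive data; `count_keywords` is an illustrative name. Because the split is on every comma, a keyword that itself contains commas is counted as several items, which would be consistent with "Outflows", "jets" and "feedback" sharing the same count in the output above.

```python
from collections import Counter

# Hypothetical sample of science_keyword values as they appear in the result table
sample = [
    "Starburst galaxies, Sub-mm Galaxies (SMG)",
    "Outflows, jets, feedback",
    "Sub-mm Galaxies (SMG)",
]


def count_keywords(values):
    """Split each comma-separated entry and count the individual items."""
    counts = Counter()
    for value in values:
        for item in value.split(","):
            item = item.strip()
            if item:  # ignore empty fragments from trailing commas
                counts[item] += 1
    return counts


keyword_counts = count_keywords(sample)
for keyword, n in keyword_counts.most_common():
    print(f"{keyword}: {n}")
```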
Show the list of all science keywords in the ALMA database:
query = f"""SELECT DISTINCT(science_keyword) from ivoa.obscore"""
flattened_science_keywords = sorted(list(set([item.strip() for sublist in [keyword.split(',') for keyword in service.search(query).to_table().to_pandas()['(science_keyword)']] for item in sublist])))
print(*flattened_science_keywords, sep="\n")
Active Galactic Nuclei (AGN)/Quasars (QSO)
Astrochemistry
Asymptotic Giant Branch (AGB) stars
Black holes
Brown dwarfs
Cataclysmic stars
Cosmic Microwave Background (CMB)/Sunyaev-Zel'dovich Effect (SZE)
Damped Lyman Alpha (DLA) systems
Debris disks
Disks around high-mass stars
Disks around low-mass stars
Dwarf/metal-poor galaxies
Early-type galaxies
Evolved stars - Chemistry
Evolved stars - Shaping/physical structure
Evolved stars: Shaping/physical structure
Exo-planets
Exoplanets
Galactic centres/nuclei
Galaxy Clusters
Galaxy chemistry
Galaxy groups and clusters
Galaxy structure & evolution
Galaxy structure &evolution
Gamma Ray Bursts (GRB)
Giant Molecular Clouds (GMC) properties
Gravitational lenses
HII regions
High-mass star formation
High-z Active Galactic Nuclei (AGN)
Hypergiants
Infra-Red Dark Clouds (IRDC)
Inter-Stellar Medium (ISM)/Molecular clouds
Intermediate-mass star formation
Low-mass star formation
Luminous Blue Variables (LBV)
Luminous and Ultra-Luminous Infra-Red Galaxies (LIRG & ULIRG)
Lyman Alpha Emitters/Blobs (LAE/LAB)
Lyman Break Galaxies (LBG)
Magellanic Clouds
Main sequence stars
Merging and interacting galaxies
Outflows
Photon-Dominated Regions (PDR)/X-Ray Dominated Regions (XDR)
Post-AGB stars
Pre-stellar cores
Pulsars and neutron stars
Solar system - Asteroids
Solar system - Comets
Solar system - Planetary atmospheres
Solar system - Planetary surfaces
Solar system - Trans-Neptunian Objects (TNOs)
Spiral galaxies
Starburst galaxies
Starbursts
Sub-mm Galaxies (SMG)
Supernovae (SN) ejecta
Surveys of galaxies
The Sun
Transients
White dwarfs
feedback
jets
jets and ionized winds
star formation
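The list contains a few near-duplicates that differ only in punctuation or spacing, for example "Exo-planets" and "Exoplanets". When counting by keyword it can help to spot such variants first. The following is a rough sketch, assuming that lowercasing and dropping every non-alphanumeric character is an acceptable comparison key; `keyword_key` and `group_variants` are hypothetical helpers.

```python
import re
from collections import defaultdict


def keyword_key(keyword: str) -> str:
    """Comparison key: lowercase with all non-alphanumeric characters removed."""
    return re.sub(r"[^a-z0-9]+", "", keyword.lower())


def group_variants(keywords):
    """Return the lists of keywords that share the same comparison key."""
    groups = defaultdict(list)
    for keyword in keywords:
        groups[keyword_key(keyword)].append(keyword)
    return [variants for variants in groups.values() if len(variants) > 1]


# A few entries copied from the list above
keywords = [
    "Exo-planets",
    "Exoplanets",
    "Galaxy structure & evolution",
    "Galaxy structure &evolution",
    "Evolved stars - Shaping/physical structure",
    "Evolved stars: Shaping/physical structure",
    "Galaxy Clusters",
    "Galaxy groups and clusters",
]

for variants in group_variants(keywords):
    print(" | ".join(variants))
```

Keywords that are genuinely different, such as "Galaxy Clusters" and "Galaxy groups and clusters", keep distinct keys and are not merged.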
First, we investigate this question using the corresponding scientific keyword.
output_smgs_scikey = query_science_keyword(service, 'Sub-mm Galaxies (SMG)')
print(f"There are {len(output_smgs_scikey)} observations with keyword 'Sub-mm Galaxies (SMG)'.")
There are 10649 observations with keyword 'Sub-mm Galaxies (SMG)'.
Similarly, we can look for sub-millimeter galaxies in the proposal abstract texts, for example with:
query = """
SELECT s_ra, s_dec, target_name, t_exptime
FROM ivoa.obscore
WHERE (proposal_abstract like '%SMG%'
OR proposal_abstract like '%ub-mm galax%')
AND science_observation = 'T'
GROUP BY s_ra, s_dec, target_name, t_exptime
"""
output_smgs_abstract = service.search(query).to_table().to_pandas()
print(f"There are {len(output_smgs_abstract)} observations where the proposal abstract mentions strings related to SMGs.")
There are 6035 observations where the proposal abstract mentions strings related to SMGs.
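When OR and AND are mixed in a WHERE clause, AND binds more tightly than OR in SQL, and the ADQL grammar follows the same rule, so a trailing `AND science_observation = 'T'` without parentheses only restricts the last OR branch. The sketch below illustrates the effect with SQLite as a stand-in for the TAP service; the table and its five rows are invented for the demonstration.

```python
import sqlite3

# In-memory toy table standing in for ivoa.obscore (hypothetical rows)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (abstract TEXT, science_observation TEXT)")
conn.executemany(
    "INSERT INTO obs VALUES (?, ?)",
    [
        ("An SMG survey", "T"),
        ("An SMG calibrator scan", "F"),
        ("Sub-mm galaxies at high z", "T"),
        ("Sub-mm galaxies pointing check", "F"),
        ("Protoplanetary disks", "T"),
    ],
)

# Without parentheses: read as  A OR (B AND C)
unparenthesized = conn.execute(
    "SELECT COUNT(*) FROM obs "
    "WHERE abstract LIKE '%SMG%' OR abstract LIKE '%ub-mm galax%' "
    "AND science_observation = 'T'"
).fetchone()[0]

# With parentheses: (A OR B) AND C, so the science flag applies to both patterns
parenthesized = conn.execute(
    "SELECT COUNT(*) FROM obs "
    "WHERE (abstract LIKE '%SMG%' OR abstract LIKE '%ub-mm galax%') "
    "AND science_observation = 'T'"
).fetchone()[0]

print(unparenthesized, parenthesized)
conn.close()
```

In the toy table the unparenthesized form also counts the non-science row that mentions "SMG", while the parenthesized form keeps only science observations.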
We can now plot the positions of these sub-mm galaxy observations:
plt.rcParams["figure.figsize"] = (20,15)
ax = output_smgs_scikey.plot(x='s_ra',y='s_dec', linestyle='', ms=5, marker='o', label='ALMA observed SMGs (science keyword)', alpha=0.02)
output_smgs_abstract.plot(x='s_ra',y='s_dec', linestyle='', ms=5, marker='o', label='ALMA observed SMGs (proposal abstract)', alpha=0.02, ax=ax)
To answer this question, we first group the observations together:
# Recursively link all observations whose centres lie within 30 arcsec of another observation of the same group.
eps = np.radians(30 / 3600)  # 30 arcsec expressed in radians
# The haversine metric expects [latitude, longitude] in radians, i.e. [dec, ra].
cluster = sklearn.cluster.DBSCAN(eps=eps, min_samples=3, algorithm='ball_tree', metric='haversine').fit(np.radians(output_smgs_scikey[['s_dec', 's_ra']].to_numpy()))
output_smgs_scikey['galaxylabels'] = cluster.labels_
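The linking criterion is an angular distance on the sky compared against `eps`. For sky coordinates the declination plays the role of latitude and the right ascension the role of longitude in the haversine formula. A self-contained sketch of that distance, using only the standard library; `angular_separation` is a hypothetical helper and the two pointings are invented:

```python
import math


def angular_separation(ra1, dec1, ra2, dec2):
    """Haversine great-circle separation; all inputs and the result in radians."""
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec**2 + math.cos(dec1) * math.cos(dec2) * sin_dra**2
    return 2.0 * math.asin(math.sqrt(h))


ARCSEC = math.radians(1.0 / 3600.0)  # one arcsecond in radians
eps = 30 * ARCSEC                    # linking length of 30 arcsec

# Two hypothetical pointings at dec = 60 deg, offset by 40 arcsec in RA only
ra1, dec1 = math.radians(150.0), math.radians(60.0)
ra2, dec2 = math.radians(150.0 + 40.0 / 3600.0), math.radians(60.0)

sep = angular_separation(ra1, dec1, ra2, dec2)
print(f"separation = {sep / ARCSEC:.1f} arcsec, linked: {sep < eps}")
```

The RA offset shrinks by cos(dec) on the sky, so an offset of 40 arcsec in RA at a declination of 60 degrees still falls within the 30 arcsec linking length.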
We then sum the exposure times per group, sort the groups by total exposure time and print a summary of the groups identified. The first entry of the sorted list is skipped: it collects all observations that DBSCAN did not assign to any group (label -1), i.e. those with too few overlapping observations.
output_clustered_and_sorted = output_smgs_scikey.groupby(['galaxylabels'])['t_exptime'].agg('sum').reset_index().sort_values('t_exptime', ascending=False)
plotarray = []
for i, row in output_clustered_and_sorted.iloc[1:51].iterrows():
    galaxylabel = int(row['galaxylabels'])
    galaxygroup = output_smgs_scikey.loc[(output_smgs_scikey['galaxylabels']==galaxylabel)]
    plotarray.append([row['t_exptime']/3600, galaxygroup['target_name'].values[0]])
    print(f"Group number {galaxylabel} with ")
    print(f" - {len(galaxygroup['s_ra'])} observations \n - total exposure time {row['t_exptime']/3600:.1f} hours")
    print(f" - average ra={galaxygroup['s_ra'].mean():.4f} and dec={galaxygroup['s_dec'].mean():.4f} ")
    print(f" - has the unique source names: {', '.join(list(set(galaxygroup['target_name'].values)))}\n")
Group number 548 with
 - 8 observations
 - total exposure time 21.5 hours
 - average ra=135.7983 and dec=0.6518
 - has the unique source names: ID81, SDP81, SDP.81, SDP_81

Group number 0 with
 - 13 observations
 - total exposure time 19.1 hours
 - average ra=181.3462 and dec=-7.7088
 - has the unique source names: BR1202-0725, BR1202

Group number 90 with
 - 16 observations
 - total exposure time 18.5 hours
 - average ra=47.8863 and dec=-58.3924
 - has the unique source names: SPT0311-58

Group number 469 with
 - 14 observations
 - total exposure time 16.4 hours
 - average ra=149.9286 and dec=2.4940
 - has the unique source names: AzTEC1, AzTEC-1, COS850.0023, S2COSMOS.850.6, AzTECC5, AS2COS0023.1

Group number 289 with
 - 13 observations
 - total exposure time 16.1 hours
 - average ra=149.9949 and dec=2.5805
 - has the unique source names: AzTEC8, AzTECC2a, AzTECC2b, AzTECC2, AzTEC_8, S2COS.0328, AS2COS0028.1

Group number 465 with
 - 18 observations
 - total exposure time 13.1 hours
 - average ra=357.4280 and dec=-56.6380
 - has the unique source names: spt2349, spt2349-56, SPT2349-56, SPT_2349-56

Group number 115 with
 - 6 observations
 - total exposure time 12.7 hours
 - average ra=10.5985 and dec=-33.7266
 - has the unique source names: HATLAS_RED_293, GRH_north, SGP-UR-54092, SGP54092, GRH_south, SGP-54092

Group number 419 with
 - 17 observations
 - total exposure time 11.1 hours
 - average ra=56.6714 and dec=-52.0840
 - has the unique source names: SPT0346-52, SPT_0346-52

Group number 214 with
 - 15 observations
 - total exposure time 10.7 hours
 - average ra=53.1066 and dec=-27.8702
 - has the unique source names: ALESS045.1, GSp_06, scuba2-055, AGS11, LESSJ033225.7-275228, basic19+8, GSp_14, GSp_12, scuba2-14, LESS45

Group number 472 with
 - 10 observations
 - total exposure time 10.3 hours
 - average ra=53.1441 and dec=-27.8725
 - has the unique source names: scuba2-072, basic17+73, GSp_07, scuba2-25, scuba2-54

Group number 206 with
 - 9 observations
 - total exposure time 9.5 hours
 - average ra=150.1094 and dec=2.2570
 - has the unique source names: 627356, ID85001929, csm1, MAMBO-9, S2COSMOS.850.50

Group number 3 with
 - 11 observations
 - total exposure time 9.4 hours
 - average ra=53.1221 and dec=-27.9388
 - has the unique source names: XID_403, ALESS73.1, LESS J033229.4-275619, ALESS_73.1, LESS73, LESSJ033229.3-275619, ALESS073.1

Group number 37 with
 - 12 observations
 - total exposure time 9.2 hours
 - average ra=150.0873 and dec=2.5888
 - has the unique source names: COS850.0059, LBG-1, S2COSMOS.850.40, aztec3, Aztec-3, aztec3-pcluster, AzTEC-3, Alpine_8

Group number 238 with
 - 14 observations
 - total exposure time 9.0 hours
 - average ra=216.0582 and dec=2.3843
 - has the unique source names: HATLAS_RED_2277, GAMA15-1, J142413.9+022304, ID15-141, G15v2.779, ID141

Group number 358 with
 - 8 observations
 - total exposure time 9.0 hours
 - average ra=323.7980 and dec=-1.0478
 - has the unique source names: Eyelash, Name_Eyelash, SMMJ21352-0102

Group number 23 with
 - 10 observations
 - total exposure time 8.7 hours
 - average ra=64.6648 and dec=-47.8644
 - has the unique source names: SPT_0418-47, SPT-0418, SPT0418-47

Group number 508 with
 - 9 observations
 - total exposure time 8.5 hours
 - average ra=150.0621 and dec=2.3790
 - has the unique source names: Tune147, ZF-COSMOS-13172, Hyde, ZF-COSMOS-13414, COSMOS-32409

Group number 145 with
 - 11 observations
 - total exposure time 8.4 hours
 - average ra=0.7795 and dec=-33.0473
 - has the unique source names: sgp38, SGP38326, HATLAS_RED_28, SGP38, SGP-38326

Group number 200 with
 - 11 observations
 - total exposure time 8.2 hours
 - average ra=334.3845 and dec=0.2954
 - has the unique source names: ADF22.A1, AzTEC1, SSA22-AZ001, ADF22.A5, ADF22.A7, SSA22-AzTEC1, SSA.0001, DSFG1

Group number 344 with
 - 44 observations
 - total exposure time 7.3 hours
 - average ra=3.5819 and dec=-30.3823
 - has the unique source names: A2744_b4_p4, A2744_b4_p5, A2744_b4_p2, A2744_b3_p4, A2744_p8, Abell_2744, A2744_p11, A2744_b3_p3, A2744_p12, A2744_p10, A2744_b3_p2, A2744_b4_p3, A2744_p13

Group number 150 with
 - 17 observations
 - total exposure time 7.0 hours
 - average ra=57.1760 and dec=-62.3469
 - has the unique source names: 0348-L5, spt0348, SPT0348-62, SPT0348

Group number 541 with
 - 11 observations
 - total exposure time 6.6 hours
 - average ra=135.7625 and dec=-1.6908
 - has the unique source names: H-ATLAS_J090302.9-014127, SDP17, SDP17b, sdp17b, SDP.17B, J090302.9-014127

Group number 285 with
 - 5 observations
 - total exposure time 6.3 hours
 - average ra=88.3706 and dec=-33.7010
 - has the unique source names: MACSJ0553.4-3342, MJ0553-ID19

Group number 542 with
 - 8 observations
 - total exposure time 6.3 hours
 - average ra=84.5704 and dec=-50.5143
 - has the unique source names: SPT-S_J053816-5030.8, SPT0538-50

Group number 97 with
 - 7 observations
 - total exposure time 6.2 hours
 - average ra=334.3056 and dec=0.4476
 - has the unique source names: ASA26.1, SSA22-AzTEC26, SSA22-AZ043

Group number 720 with
 - 3 observations
 - total exposure time 6.1 hours
 - average ra=356.5393 and dec=12.8220
 - has the unique source names: BX610

Group number 688 with
 - 5 observations
 - total exposure time 6.0 hours
 - average ra=135.1906 and dec=0.6899
 - has the unique source names: G09_83808, GAMA09-100954, G09-83808, HATLAS_J090045

Group number 440 with
 - 11 observations
 - total exposure time 5.8 hours
 - average ra=132.3891 and dec=2.2452
 - has the unique source names: G09v1.124, J084933.4+021443, HATLAS_J084933.4

Group number 370 with
 - 8 observations
 - total exposure time 5.7 hours
 - average ra=53.1622 and dec=-27.7852
 - has the unique source names: UDF3, HUDF-JVLA-ALMA, scuba2-099, ASAGAO35, GOODS-S, ASAGAO45

Group number 180 with
 - 13 observations
 - total exposure time 5.7 hours
 - average ra=41.4337 and dec=-63.3442
 - has the unique source names: SPT0245-63

Group number 389 with
 - 10 observations
 - total exposure time 5.6 hours
 - average ra=349.8411 and dec=-55.9661
 - has the unique source names: SPT_2319-55, SPT2319-55

Group number 516 with
 - 6 observations
 - total exposure time 5.4 hours
 - average ra=150.3752 and dec=2.0376
 - has the unique source names: S2COS.0762, AzTECC129

Group number 117 with
 - 7 observations
 - total exposure time 5.3 hours
 - average ra=312.0956 and dec=-55.3448
 - has the unique source names: SPT2048-55

Group number 213 with
 - 4 observations
 - total exposure time 5.2 hours
 - average ra=104.6555 and dec=-55.9518
 - has the unique source names: SMMJ0658

Group number 529 with
 - 6 observations
 - total exposure time 5.2 hours
 - average ra=34.4318 and dec=-5.2400
 - has the unique source names: UDS-43941, SXDF-ALMA3

Group number 515 with
 - 4 observations
 - total exposure time 5.1 hours
 - average ra=44.4210 and dec=-22.1501
 - has the unique source names: MACS0257.6-2209

Group number 255 with
 - 3 observations
 - total exposure time 5.1 hours
 - average ra=150.0721 and dec=2.4544
 - has the unique source names: alma2mm.9

Group number 6 with
 - 11 observations
 - total exposure time 5.1 hours
 - average ra=83.2125 and dec=-50.7855
 - has the unique source names: SPT_0532-50, SPT-0532, SPT0532-50

Group number 111 with
 - 9 observations
 - total exposure time 5.0 hours
 - average ra=64.3929 and dec=-11.9124
 - has the unique source names: MJ0417-ID2, MACSJ0417.5-1154, MJ0417-ID234

Group number 368 with
 - 7 observations
 - total exposure time 4.9 hours
 - average ra=35.0694 and dec=-6.0289
 - has the unique source names: HXMM01

Group number 179 with
 - 11 observations
 - total exposure time 4.7 hours
 - average ra=358.4138 and dec=-50.1687
 - has the unique source names: SPT_2353-50, SPT2353-50

Group number 193 with
 - 11 observations
 - total exposure time 4.6 hours
 - average ra=357.9620 and dec=-57.3715
 - has the unique source names: SPT2351-57, SPT_2351-57

Group number 39 with
 - 17 observations
 - total exposure time 4.4 hours
 - average ra=53.2089 and dec=-27.5271
 - has the unique source names: ALESS112.1, LESSJ033249.3-273112, ALESS_112.1, ALMA_3mm_ID13_ID14, ALESS87, LESS112, ALESS87.3, ALESS087.1, LESSJ033251.1-273143, ALMA_3mm_ID13

Group number 432 with
 - 7 observations
 - total exposure time 4.4 hours
 - average ra=6.9056 and dec=-2.1329
 - has the unique source names: HELMS_65, helms65, HeLMS-65

Group number 475 with
 - 11 observations
 - total exposure time 4.2 hours
 - average ra=69.2374 and dec=-54.6361
 - has the unique source names: ADFS27, ADFS-27, ADFS_27

Group number 103 with
 - 8 observations
 - total exposure time 4.1 hours
 - average ra=203.4062 and dec=24.2612
 - has the unique source names: ur56917, UR56917, HATLAS_RED_1930, NGP-190387

Group number 365 with
 - 6 observations
 - total exposure time 4.0 hours
 - average ra=326.8298 and dec=-50.5986
 - has the unique source names: SPT_2147-50, SPT2147-50

Group number 139 with
 - 4 observations
 - total exposure time 4.0 hours
 - average ra=316.8265 and dec=23.5227
 - has the unique source names: 4C23.56

Group number 360 with
 - 11 observations
 - total exposure time 3.9 hours
 - average ra=40.7865 and dec=-49.2598
 - has the unique source names: SPT0243-49, SPT_0243-49

Group number 22 with
 - 10 observations
 - total exposure time 3.9 hours
 - average ra=32.4218 and dec=0.2663
 - has the unique source names: hers1, HERS1, PJ020941.3
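The ranking above relies on the unassigned observations ending up first in the sorted list. scikit-learn's DBSCAN marks such observations with the label -1, so an alternative is to filter on that label explicitly before summing. A minimal sketch with invented (label, exposure time) pairs; `total_exposure_by_group` is a hypothetical helper:

```python
from collections import defaultdict

# Hypothetical (group label, exposure time in seconds) pairs; -1 marks DBSCAN noise
observations = [
    (0, 3600.0),
    (1, 1800.0),
    (-1, 9000.0),
    (0, 5400.0),
    (1, 900.0),
    (-1, 7200.0),
]


def total_exposure_by_group(rows, noise_label=-1):
    """Sum exposure times per group, ignoring noise, sorted by descending total."""
    totals = defaultdict(float)
    for label, exptime in rows:
        if label == noise_label:
            continue  # observation was not assigned to any group
        totals[label] += exptime
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = total_exposure_by_group(observations)
for label, seconds in ranking:
    print(f"group {label}: {seconds / 3600:.1f} h")
```

Filtering on the label does not depend on how large the summed exposure time of the unassigned observations happens to be.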
We now plot the 50 SMGs with the longest total exposure times:
plt.bar(np.arange(len(plotarray)),np.array([p[0] for p in plotarray]), align='center', label='The ALMA SMGs with the largest total observing time')
for i in range(len(plotarray)):
    plt.text(i, plotarray[i][0]+0.5 , plotarray[i][1], color='black', fontweight='bold', rotation=45 )
plt.ylim(0,25)
plt.ylabel('Total exposure time [hrs]')
plt.legend(fontsize=15)
output_qsos_cube = query_science_keyword_datatype(service, 'Quasars', 'cube' )
output_qsos_image = query_science_keyword_datatype(service, 'Quasars', 'image' )
print(f"There are {len(np.unique(output_qsos_cube['target_name']))} line observations (cube) and {len(np.unique(output_qsos_image['target_name']))} continuum observations (image) in the ALMA database.")
There are 681 line observations (cube) and 865 continuum observations (image) in the ALMA database.