全球地震数据集

美国政府维护着一系列来自最近地震事件的地震相关数据的实时馈送。您也可以选择检查过去一小时到过去三十天的数据。您可以选择检查具有各种大小的事件的数据。对于该项目,我们将使用包含过去七天内的所有地震事件的数据集,其具有1.0或更大的量值。

您还可以选择各种格式。在第一个示例中,我们将讨论如何以csv格式(逗号分隔值)解析文件。有更方便的格式来处理如“json”,但并不是所有的数据集都整齐地组织起来。我们将开始解析一个csv文件,然后可以看看如何使用json格式。

要在自己的系统上跟踪此项目,请转到USGS源的地震数据的csv文件,并下载“过去7天”标题下的文件“M1.0 +地震”。如果你喜欢,这里是一个直接链接到那个文件。此数据每5分钟更新一次,因此您的数据将不会完全符合您在此处看到的内容。格式应该匹配,但数据本身不匹配。

解析数据,如果我们检查数据集的文本文件的前几行,我们可以识别与我们最相关的信息:

time,latitude,longitude,depth,mag,magType,nst,gap,dmin,rms,net,id,updated,place,type
2015-04-25T14:25:06.520Z,40.6044998,-121.8546677,17.65,1.92,md,7,172,0.0806,0.02,nc,nc72435025,2015-04-25T14:31:51.640Z,"12km NNE of Shingletown, California",earthquake
2015-04-25T14:21:16.420Z,37.6588326,-122.5056686,7.39,2.02,md,21,99,0.07232,0.04,nc,nc72435020,2015-04-25T14:26:05.693Z,"3km SSW of Broadmoor, California",earthquake
2015-04-25T14:14:40.000Z,62.6036,-147.6845,43.1,1.7,ml,,,,0.67,ak,ak11565061,2015-04-25T14:40:18.782Z,"108km NNE of Sutton-Alpine, Alaska",earthquake
2015-04-25T14:10:02.830Z,27.5843,85.6622,10,4.6,mb,,86,5.898,0.87,us,us20002965,2015-04-25T14:38:25.174Z,"14km E of Panaoti, Nepal",earthquake
2015-04-25T13:55:47.040Z,33.0888333,-116.0531667,5.354,1.09,ml,23,38,0.1954,0.26,ci,ci37151511,2015-04-25T13:59:57.204Z,"10km SE of Ocotillo Wells, California",earthquake

现在,我们只对每次地震的纬度和经度感兴趣。如果我们看第一行,看起来就像我们对每行的第二和第三列感兴趣。在保存程序文件的目录中,创建一个名为“datasets”的目录。将文本文件保存为“earthquake_data.csv”在此新目录中。

使用Python的csv模块来解析数据,我们将使用Python的csv模块模块处理数据,这简化了使用csv文件的过程。

以下代码生成两个列表,包含文件中每个地震的纬度和经度:

In [2]:
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
import os
import csv
fig_index = 0

# Open the earthquake data file.
filename = '/gdata/all_week.csv'

# Create empty lists for the latitudes and longitudes.
lats, lons = [], []

# Read through the entire file, skip the first line,
#  and pull out just the lats and lons.
with open(filename) as f:
    # Create a csv reader object.
    reader = csv.reader(f)

    # Ignore the header row.
    next(reader)

    # Store the latitudes and longitudes in the appropriate lists.
    for row in reader:
        lats.append(float(row[1]))
        lons.append(float(row[2]))

我们创建一个空列表来包含纬度和经度。然后我们使用with语句确保文件一旦被读取就关闭,即使在处理文件时有错误。

打开数据文件,我们初始化一个csv reader对象。使用next()函数跳过标题行。然后我们循环数据文件中的每一行,并拉出我们想要的信息。

绘制地震

使用我们所了解的绘制一组点,我们现在可以对这些点进行简单的绘图:

In [3]:
import csv


# Create empty lists for the latitudes and longitudes.
lats, lons = [], []

# Read through the entire file, skip the first line,
#  and pull out just the lats and lons.
with open(filename) as f:
    # Create a csv reader object.
    reader = csv.reader(f)

    # Ignore the header row.
    next(reader)

    # Store the latitudes and longitudes in the appropriate lists.
    for row in reader:
        lats.append(float(row[1]))
        lons.append(float(row[2]))

# --- Build Map ---
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np

eq_map = Basemap(projection='robin', resolution = 'l', area_thresh = 1000.0,
              lat_0=0, lon_0=-130)
eq_map.drawcoastlines()
eq_map.drawcountries()
eq_map.fillcontinents(color = 'gray')
eq_map.drawmapboundary()
eq_map.drawmeridians(np.arange(0, 360, 30))
eq_map.drawparallels(np.arange(-90, 90, 30))

x,y = eq_map(lons, lats)
eq_map.plot(x, y, 'ro', markersize=6)

plt.show()

在大约40行代码中,我们把一个巨大的文本文件变成了一个信息图。但是应该有一个相当明显的改进,我们应该让地图上的点表示每个地震的幅度。我们首先将每个地震的纬度和经度读取到一个列表中:

In [4]:
# Create empty lists for the data we are interested in.
lats, lons = [], []
magnitudes = []

# Read through the entire file, skip the first line,
#  and pull out just the lats and lons.
with open(filename) as f:
    # Create a csv reader object.
    reader = csv.reader(f)

    # Ignore the header row.
    next(reader)

    # Store the latitudes and longitudes in the appropriate lists.
    for row in reader:
        lats.append(float(row[1]))
        lons.append(float(row[2]))
        magnitudes.append(float(row[4]))

# --- Build Map ---
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np

eq_map = Basemap(projection='robin', resolution = 'l', area_thresh = 1000.0,
              lat_0=0, lon_0=-130)
eq_map.drawcoastlines()
eq_map.drawcountries()
eq_map.fillcontinents(color = 'gray')
eq_map.drawmapboundary()
eq_map.drawmeridians(np.arange(0, 360, 30))
eq_map.drawparallels(np.arange(-90, 90, 30))

x,y = eq_map(lons, lats)
eq_map.plot(x, y, 'ro', markersize=6)

plt.show()

现在,不是一次绘制所有的点,而是循环遍历点。当我们绘制每个点,我们将根据大小调整点的大小。由于幅度从1.0开始,我们可以简单地使用幅度作为比例因子。为了得到标记的大小,我们只是乘以我们想要在我们的地图上的最小点的幅度:

In [5]:
%matplotlib inline
import csv

# Open the earthquake data file.
# filename = 'datasets/earthquake_data.csv'

# Create empty lists for the data we are interested in.
lats, lons = [], []
magnitudes = []

# Read through the entire file, skip the first line,
#  and pull out just the lats and lons.
with open(filename) as f:
    # Create a csv reader object.
    reader = csv.reader(f)

    # Ignore the header row.
    next(reader)

    # Store the latitudes and longitudes in the appropriate lists.
    for row in reader:
        lats.append(float(row[1]))
        lons.append(float(row[2]))
        magnitudes.append(float(row[4]))

# --- Build Map ---
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np

eq_map = Basemap(projection='robin', resolution = 'l', area_thresh = 1000.0,
              lat_0=0, lon_0=-130)
eq_map.drawcoastlines()
eq_map.drawcountries()
eq_map.fillcontinents(color = 'gray')
eq_map.drawmapboundary()
eq_map.drawmeridians(np.arange(0, 360, 30))
eq_map.drawparallels(np.arange(-90, 90, 30))

min_marker_size = 2.5
for lon, lat, mag in zip(lons, lats, magnitudes):
    x,y = eq_map(lon, lat)
    msize = mag * min_marker_size
    eq_map.plot(x, y, 'ro', markersize=msize)

plt.show()

我们定义最小标记大小,然后循环遍历所有数据点,通过将地震的幅度乘以最小标记大小来计算每个点的标记大小。

如果你以前没有使用过zip()函数,它则需要一些列表,并从每个列表中提取一个项目。在每次循环迭代,都有一个匹配的经度,纬度和每个地震的幅度。

添加颜色

我们还可以做一个更改,以生成更有意义的可视化。让我们使用一些不同的颜色来表示幅度。我们使小地震为绿色,中等地震为黄色,大地震为红色。以下版本包括一个功能,用于标识每个地震的适当颜色:

In [6]:
# Open the earthquake data file.
# 
# Create empty lists for the data we are interested in.
lats, lons = [], []
magnitudes = []

# Read through the entire file, skip the first line,
#  and pull out just the lats and lons.
with open(filename) as f:
    # Create a csv reader object.
    reader = csv.reader(f)

    # Ignore the header row.
    next(reader)

    # Store the latitudes and longitudes in the appropriate lists.
    for row in reader:
        lats.append(float(row[1]))
        lons.append(float(row[2]))
        magnitudes.append(float(row[4]))

# --- Build Map ---
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np

def get_marker_color(magnitude):
    # Returns green for small earthquakes, yellow for moderate
    #  earthquakes, and red for significant earthquakes.
    if magnitude < 3.0:
        return ('go')
    elif magnitude < 5.0:
        return ('yo')
    else:
        return ('ro')

eq_map = Basemap(projection='robin', resolution = 'l', area_thresh = 1000.0,
              lat_0=0, lon_0=-130)
eq_map.drawcoastlines()
eq_map.drawcountries()
eq_map.fillcontinents(color = 'gray')
eq_map.drawmapboundary()
eq_map.drawmeridians(np.arange(0, 360, 30))
eq_map.drawparallels(np.arange(-90, 90, 30))

min_marker_size = 2.5
for lon, lat, mag in zip(lons, lats, magnitudes):
    x,y = eq_map(lon, lat)
    msize = mag * min_marker_size
    marker_string = get_marker_color(mag)
    eq_map.plot(x, y, marker_string, markersize=msize)

plt.show()

现在我们可以很容易地看到重要的地震发生在哪里。

添加标题

在我们完成之前,让我们在地图上添加一个标题。我们的标题需要包括这些地震的日期范围,这需要我们在解析原始文本时提取更多的数据。我们将使用第一次和最后一次地震的日期作为标题。由于文件首先包括最近的地震,我们需要使用最后一个项目作为开始日期:

In [7]:
# Open the earthquake data file.

# Create empty lists for the data we are interested in.
lats, lons = [], []
magnitudes = []
timestrings = []

# Read through the entire file, skip the first line,
#  and pull out just the lats and lons.
with open(filename) as f:
    # Create a csv reader object.
    reader = csv.reader(f)

    # Ignore the header row.
    next(reader)

    # Store the latitudes and longitudes in the appropriate lists.
    for row in reader:
        lats.append(float(row[1]))
        lons.append(float(row[2]))
        magnitudes.append(float(row[4]))
        timestrings.append(row[0])

# --- Build Map ---
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np

def get_marker_color(magnitude):
    # Returns green for small earthquakes, yellow for moderate
    #  earthquakes, and red for significant earthquakes.
    if magnitude < 3.0:
        return ('go')
    elif magnitude < 5.0:
        return ('yo')
    else:
        return ('ro')

eq_map = Basemap(projection='robin', resolution = 'l', area_thresh = 1000.0,
              lat_0=0, lon_0=-130)
eq_map.drawcoastlines()
eq_map.drawcountries()
eq_map.fillcontinents(color = 'gray')
eq_map.drawmapboundary()
eq_map.drawmeridians(np.arange(0, 360, 30))
eq_map.drawparallels(np.arange(-90, 90, 30))

min_marker_size = 2.5
for lon, lat, mag in zip(lons, lats, magnitudes):
    x,y = eq_map(lon, lat)
    msize = mag * min_marker_size
    marker_string = get_marker_color(mag)
    eq_map.plot(x, y, marker_string, markersize=msize)

title_string = "Earthquakes of Magnitude 1.0 or Greater\n"
title_string += "%s through %s" % (timestrings[-1], timestrings[0])
plt.title(title_string)

plt.show()

目前地图显示正常,但时区格式使得标题有点难以阅读。让我们使用日期,忽略标题中的时间。我们可以通过使用slice:timestring [:10]来保持每个时间戳的前10个字符。由于这是这个项目的最后一次迭代,让我们也使绘图大小变更大:

In [8]:
-
  File "<ipython-input-8-e5a42c761828>", line 2
    -
     ^
SyntaxError: invalid syntax

要了解Basemap库的灵活性,取决于查看您可以多快地更改此地图的外观和感觉。注释大洲颜色的行,要替换它对bluemarble的调用。如果您在自己的系统上制作此地图,也可以调整min_marker_size:

In [ ]:
lats, lons = [], []
magnitudes = []
timestrings = []

# Read through the entire file, skip the first line,
#  and pull out just the lats and lons.
with open(filename) as f:
    # Create a csv reader object.
    reader = csv.reader(f)

    # Ignore the header row.
    next(reader)

    # Store the latitudes and longitudes in the appropriate lists.
    for row in reader:
        lats.append(float(row[1]))
        lons.append(float(row[2]))
        magnitudes.append(float(row[4]))
        timestrings.append(row[0])



def get_marker_color(magnitude):
    # Returns green for small earthquakes, yellow for moderate
    #  earthquakes, and red for significant earthquakes.
    if magnitude < 3.0:
        return ('go')
    elif magnitude < 5.0:
        return ('yo')
    else:
        return ('ro')

eq_map = Basemap(projection='robin', resolution = 'l', area_thresh = 1000.0,
              lat_0=0, lon_0=-130)
eq_map.drawcoastlines()
eq_map.drawcountries()
#eq_map.fillcontinents(color = 'gray')
eq_map.bluemarble()
eq_map.drawmapboundary()
eq_map.drawmeridians(np.arange(0, 360, 30))
eq_map.drawparallels(np.arange(-90, 90, 30))

min_marker_size = 2.25
for lon, lat, mag in zip(lons, lats, magnitudes):
    x,y = eq_map(lon, lat)
    msize = mag * min_marker_size
    marker_string = get_marker_color(mag)
    eq_map.plot(x, y, marker_string, markersize=msize)

title_string = "Earthquakes of Magnitude 1.0 or Greater\n"
title_string += "%s through %s" % (timestrings[-1][:10], timestrings[0][:10])
plt.title(title_string)

plt.show()

我们可以继续完善这张地图。例如,我们可以从URL中读取数据,而不是从下载的文本文件。这样地图就永远是最新的。我们可以在绿色,黄色和红色的连续尺度上确定每个点的颜色。现在你有更好的感觉如何使用底图,我希望你喜欢玩这些进一步的细化,当你映射你最感兴趣的数据集。