Search and Summarize Images and Text
After installing the som tools:
git clone https://www.github.com/vsoch/som
cd som
python setup.py install
You can type the command som
to see it’s usage:
usage: som [-h] [--version] [--debug] {list,get} ...
Stanford Open Modules for Python [SOM]
optional arguments:
-h, --help show this help message and exit
--version show software version
--debug use verbose logging to debug.
actions:
actions for som tools
{list,get} google storage and datastore
list list collections, entities, images.
get download data from storge and datastore
Make sure your project credentials are exported:
GOOGLE_APPLICATION_CREDENTIALS=/top/secret/pizza.json
export GOOGLE_APPLICATION_CREDENTIALS
GOOGLE_APPLICATION_CREDENTIALS=/home/vanessa/.vanessasaur/som-irlearning.json
Let’s list to see a summary for a project:
som list --project som-irlearning
Collections: 1
Images: 933
Entity: 233
Now let’s look just at Collection entries
som list --project som-irlearning --collections
Collection: IRB33192
updated 2017-08-28 03:28:45.640733+00:00
created 2017-08-22 03:00:53.982025+00:00
uid IRB33192
Found 1 collections
Entities?
som list --project som-irlearning --entity
...
Entity: IR664a78
UPLOAD_AGENT STARR:SENDITClient
created 2017-08-27 20:20:11.107150+00:00
id IR664a78
uid IR664a78
updated 2017-08-27 20:20:11.312405+00:00
PatientSex F
Entity: IR664a82
uid IR664a82
UPLOAD_AGENT STARR:SENDITClient
PatientAge 067Y
PatientSex F
updated 2017-08-27 20:29:05.324285+00:00
created 2017-08-27 20:29:05.212657+00:00
id IR664a82
Found 233 entities
And images (each is a compressed set, and they have quite a lot of metadata):
som list --project som-irlearning --images
Image: Collection/IRB33192/Entity/IR664a82/IR664a82_20070117_IR664a84.tar.gz
InstitutionName Stanford Med. Center
DataCollectionDiameter 500.000000
ConversionType WSD
BitsAllocated 16
uid Collection/IRB33192/Entity/IR664a82/IR664a82_20070117_IR664a84.tar.gz
PositionReferenceIndicator SN
Modality CT
StationName SCT1_OC0
SoftwareVersions LightSpeedApps308I.2_H3.1M5
ContrastBolusRoute IV
PatientSex F
2FIR664a82_20070117_IR664a84.tar.gz?generation=1503865797682327&alt=media
ContrastBolusAgent OMNI350
IssuerOfPatientID STARR. In an effort to remove PHI all dates are offset from their original values.
TableHeight 105.099998
PhotometricInterpretation MONOCHROME2
RequestingService EMERGENCY DEPARTMENT *
KVP 120
PatientIdentityRemoved Yes
PixelPaddingValue -2000
HighBit 15
...
StudyDate 2007-01-17T00:00:00-0800
RescaleIntercept -1024
Found 933 images
Filter to a subset?
som list --project som-irlearning --images --filter Modality,=,CT