suspicious characters found in locality description of record c0fa5188-9440-4727-8cf2-3d963b7de039 https://mycoportal.org/portal/collections/individual/index.php?occid=6044645 as seen in DwC-A with content id hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184 #63

Open
jhpoelen opened this issue Jan 9, 2025 · 9 comments

Comments

@jhpoelen
Member

jhpoelen commented Jan 9, 2025

when running

GIB_VERSION=hash://sha256/37bdd8ddb12df4ee02978ca59b695afd651f94398c0fe2e1f8b182849a876bb2
CONTENT_REPO=https://linker.bio

preston cat --remote $CONTENT_REPO $GIB_VERSION --no-cache\
 | elton stream --data-dir data --remote $CONTENT_REPO --no-cache

using elton 0.14.2, the stream process appears to stop on


https://en.wiktionary.org/wiki/support 813c0642-eb0a-41c4-846a-d42c6eddd14f ISC-F-0124638 f25dc2ec-d71d-4014-abb2-c3b64af25d79 ISC 168452 Diaporthe decipiens Fungi | Ascomycota | Sordariomycetes | Diaporthales | Diaporthaceae | Diaporthe | Diaporthe decipiens kingdom | phylum | class | order | family | genus | species http://purl.obolibrary.org/obo/RO_0002454 hasHost Carpinus PreservedSpecimen 1894-10-01T00:00:00Z bei Königstein https://mycoportal.org/portal/collections/individual/index.php?occid=6044642 https://mycoportal.org/portal/collections/individual/index.php?occid=6044642 local Iowa State University, Ada Hayden Herbarium. 2021-04-13. MyCoPortal - ccfac826-d31c-4265-b8bc-5af693c2ca48. hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184 2025-01-09T20:08:22.754Z 0.14.2-SNAPSHOT
https://en.wiktionary.org/wiki/support 1b71bf22-066c-49a5-8556-531d9e0e0776 ISC-F-0124511 f25dc2ec-d71d-4014-abb2-c3b64af25d79 ISC 163818 Dacryopinax spathularia Fungi | Basidiomycota | Dacrymycetes | Dacrymycetales | Dacrymycetaceae | Dacryopinax | Dacryopinax spathularia kingdom | phylum | class | order | family | genus | species http://purl.obolibrary.org/obo/RO_0002220 adjacentTo dead limb lying on ground PreservedSpecimen 1956-03-18T00:00:00Z -17.71944444 -149.3155556 Taravao District. https://mycoportal.org/portal/collections/individual/index.php?occid=6044643 https://mycoportal.org/portal/collections/individual/index.php?occid=6044643 local Iowa State University, Ada Hayden Herbarium. 2021-04-13. MyCoPortal - ccfac826-d31c-4265-b8bc-5af693c2ca48. hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184 2025-01-09T20:08:22.754Z 0.14.2-SNAPSHOT
https://en.wiktionary.org/wiki/support c0fa5188-9440-4727-8cf2-3d963b7de039 ISC-F-0124477 f25dc2ec-d71d-4014-abb2-c3b64af25d79 ISC 163733 Dacrymyces rubiformis Fungi | Basidiomycota | Dacrymycetes | Dacrymycetales | Dacrymycetaceae | Dacrymyces | Dacrymyces rubiformis kingdom | phylum | class | order | family | genus | species http://purl.obolibrary.org/obo/RO_0002454 hasHost Pinus PreservedSpecimen 1935-11-19T00:00:00Z 60.0 17.0 Upland: Lena parish, -"'Üngeby skog

@jhpoelen
Member Author

jhpoelen commented Jan 9, 2025

with "'Üngeby skog hinting at a possible character encoding issue.

@jhpoelen
Member Author

jhpoelen commented Jan 9, 2025

attempting to reproduce using

echo '{ "format": "dwca", "citation": "hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184", "url": "hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184" }'\
| elton stream --remote https://linker.bio\
> interactions.tsv

interactions.zip

@jhpoelen
Member Author

jhpoelen commented Jan 9, 2025

an alternate way to reproduce:

echo "hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184" |\
 preston dwc-stream --remote https://linker.bio\
 | grep "c0fa5188-9440-4727-8cf2-3d963b7de039"

produced:

{
  "http://www.w3.org/ns/prov#wasDerivedFrom": "line:zip:hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184!/occurrences.csv!/L38568",
  "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "http://rs.tdwg.org/dwc/terms/Occurrence",
  "http://rs.tdwg.org/dwc/text/id": "6044645",
  "http://rs.tdwg.org/dwc/terms/locationRemarks": null,
  "http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters": null,
  "http://rs.tdwg.org/dwc/terms/verbatimElevation": null,
  "http://rs.tdwg.org/dwc/terms/identifiedBy": "L.L. Kennedy",
  "http://rs.tdwg.org/dwc/terms/identificationQualifier": null,
  "http://purl.org/dc/elements/1.1/rights": "http://creativecommons.org/publicdomain/zero/1.0/",
  "http://rs.tdwg.org/dwc/terms/endDayOfYear": null,
  "http://rs.tdwg.org/dwc/terms/family": "Dacrymycetaceae",
  "http://rs.tdwg.org/dwc/terms/maximumDepthInMeters": null,
  "http://rs.tdwg.org/dwc/terms/occurrenceRemarks": "There are no mature basisia or spores but I think this is Dacrymyces palmatus (Schw.) Bres.; young state",
  "http://rs.tdwg.org/dwc/terms/locality": "Upland: Lena parish, �-\"'Üngeby skog�-�; 60 17",
  "http://rs.tdwg.org/dwc/terms/collectionID": "f25dc2ec-d71d-4014-abb2-c3b64af25d79",
  "http://rs.tdwg.org/dwc/terms/lifeStage": null,
  "http://rs.tdwg.org/dwc/terms/taxonID": "163733",
  "http://rs.tdwg.org/dwc/terms/taxonRemarks": null,
  "http://rs.tdwg.org/dwc/terms/scientificNameAuthorship": "(Fr.) Neuhoff",
  "http://rs.tdwg.org/dwc/terms/year": "1935",
  "http://rs.tdwg.org/dwc/terms/georeferenceVerificationStatus": null,
  "http://rs.tdwg.org/dwc/terms/georeferencedBy": null,
  "http://purl.org/dc/terms/language": null,
  "http://rs.tdwg.org/dwc/terms/georeferenceSources": null,
  "http://rs.tdwg.org/dwc/terms/month": "11",
  "http://rs.tdwg.org/dwc/terms/verbatimEventDate": "19.XI.1935",
  "http://rs.tdwg.org/dwc/terms/ownerInstitutionCode": null,
  "http://rs.tdwg.org/dwc/terms/institutionCode": "ISC",
  "http://rs.tdwg.org/dwc/terms/occurrenceID": "c0fa5188-9440-4727-8cf2-3d963b7de039",
  "http://purl.org/dc/terms/rightsHolder": "Iowa State University",
  "http://purl.org/dc/terms/accessRights": null,
  "http://rs.tdwg.org/dwc/terms/minimumDepthInMeters": null,
  "http://rs.tdwg.org/dwc/terms/county": "Lena",
  "http://rs.tdwg.org/dwc/terms/dateIdentified": "1956",
  "http://rs.tdwg.org/dwc/terms/verbatimDepth": null,
  "http://rs.tdwg.org/dwc/terms/stateProvince": "Uppland",
  "http://rs.tdwg.org/dwc/terms/genus": "Dacrymyces",
  "http://rs.tdwg.org/dwc/terms/eventDate": "1935-11-19",
  "http://rs.tdwg.org/dwc/terms/associatedOccurrences": null,
  "http://rs.tdwg.org/dwc/terms/individualCount": null,
  "http://rs.tdwg.org/dwc/terms/subgenus": null,
  "http://rs.tdwg.org/dwc/terms/georeferenceRemarks": null,
  "http://rs.tdwg.org/dwc/terms/kingdom": "Fungi",
  "http://rs.tdwg.org/dwc/terms/reproductiveCondition": null,
  "http://rs.tdwg.org/dwc/terms/sex": null,
  "http://symbiota.org/terms/recordEnteredBy": "Angela Yoon",
  "http://rs.tdwg.org/dwc/terms/phylum": "Basidiomycota",
  "http://rs.tdwg.org/dwc/terms/order": "Dacrymycetales",
  "http://rs.tdwg.org/dwc/terms/country": "Sweden",
  "http://rs.tdwg.org/dwc/terms/decimalLatitude": "60",
  "http://rs.tdwg.org/dwc/terms/recordNumber": "s.n.",
  "http://rs.tdwg.org/dwc/terms/higherClassification": "Fungi|Basidiomycota|Agaricomycotina|Dacrymycetes|Dacrymycetales|Dacrymycetaceae|Dacrymyces",
  "http://rs.tdwg.org/dwc/terms/habitat": "cut twigs and branches of host",
  "http://portal.idigbio.org/terms/recordId": "urn:uuid:c0fa5188-9440-4727-8cf2-3d963b7de039",
  "http://purl.org/dc/terms/references": "https://mycoportal.org/portal/collections/individual/index.php?occid=6044645",
  "http://rs.tdwg.org/dwc/terms/informationWithheld": null,
  "http://rs.tdwg.org/dwc/terms/specificEpithet": "rubiformis",
  "http://rs.tdwg.org/dwc/terms/disposition": null,
  "http://rs.tdwg.org/dwc/terms/scientificName": "Dacrymyces rubiformis",
  "http://rs.tdwg.org/dwc/terms/identificationReferences": null,
  "http://rs.tdwg.org/dwc/terms/class": "Dacrymycetes",
  "http://rs.tdwg.org/dwc/terms/associatedTaxa": "host: Pinus",
  "http://rs.tdwg.org/dwc/terms/identificationRemarks": null,
  "http://rs.tdwg.org/dwc/terms/recordedBy": "S. Lundell",
  "http://rs.tdwg.org/dwc/terms/verbatimTaxonRank": null,
  "http://rs.tdwg.org/dwc/terms/municipality": "Uppsala l",
  "http://rs.tdwg.org/dwc/terms/geodeticDatum": null,
  "http://rs.tdwg.org/dwc/terms/verbatimCoordinates": null,
  "http://rs.tdwg.org/dwc/terms/collectionCode": null,
  "http://rs.tdwg.org/dwc/terms/basisOfRecord": "PreservedSpecimen",
  "http://rs.tdwg.org/dwc/terms/preparations": null,
  "http://rs.tdwg.org/dwc/terms/catalogNumber": "ISC-F-0124477",
  "http://rs.tdwg.org/dwc/terms/startDayOfYear": "323",
  "http://rs.tdwg.org/dwc/terms/taxonRank": "Species",
  "http://rs.tdwg.org/dwc/terms/georeferenceProtocol": null,
  "http://rs.tdwg.org/dwc/terms/dataGeneralizations": null,
  "http://rs.tdwg.org/dwc/terms/maximumElevationInMeters": null,
  "http://rs.tdwg.org/dwc/terms/minimumElevationInMeters": null,
  "http://rs.tdwg.org/dwc/terms/infraspecificEpithet": null,
  "http://rs.tdwg.org/dwc/terms/day": "19",
  "http://purl.org/dc/terms/modified": "2021-04-13 02:19:54",
  "http://rs.tdwg.org/dwc/terms/typeStatus": null,
  "http://rs.tdwg.org/dwc/terms/dynamicProperties": null,
  "http://rs.tdwg.org/dwc/terms/establishmentMeans": null,
  "http://rs.tdwg.org/dwc/terms/decimalLongitude": "17",
  "http://rs.tdwg.org/dwc/terms/otherCatalogNumbers": "ISC0372286",
  "http://rs.tdwg.org/dwc/terms/fieldNumber": null
}

@jhpoelen
Member Author

jhpoelen commented Jan 9, 2025

with

  "http://rs.tdwg.org/dwc/terms/locality": "Upland: Lena parish, �-\"'Üngeby skog�-�; 60 17",

containing U+FFFD replacement characters (�).
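
A check like this could be automated. The following is a minimal sketch, not part of preston or elton: it assumes each record arrives as one JSON object (which the grep above suggests), and the helper names `suspicious_chars` and `suspicious_fields` are made up for illustration. It flags U+FFFD and control characters in any string-valued term.

```python
import json
import unicodedata

LOCALITY = "http://rs.tdwg.org/dwc/terms/locality"


def suspicious_chars(text):
    """Return (index, code point) pairs for U+FFFD and control characters."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch == "\ufffd" or unicodedata.category(ch) == "Cc"
    ]


def suspicious_fields(json_line):
    """Map each term of one JSON record to its suspicious characters, if any."""
    record = json.loads(json_line)
    report = {}
    for term, value in record.items():
        if isinstance(value, str):  # null terms are skipped
            hits = suspicious_chars(value)
            if hits:
                report[term] = hits
    return report


# Two terms copied from the record above; the others are omitted for brevity.
sample = json.dumps({
    LOCALITY: "Upland: Lena parish, \ufffd-\"'\u00dcngeby skog\ufffd-\ufffd; 60 17",
    "http://rs.tdwg.org/dwc/terms/county": "Lena",
})
report = suspicious_fields(sample)
print(report)
```

Only the locality term is reported here, with three replacement characters. Control characters that the terminal swallowed rather than displayed would show up the same way, provided they are present in the parsed value.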

@jhpoelen
Member Author

jhpoelen commented Jan 9, 2025

and

curl 'https://linker.bio/line:zip:hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184!/occurrences.csv!/L1,L38568'\
| mlr --icsv --oxtab cat

produced:

id                             6044645
institutionCode                ISC
collectionCode                 
ownerInstitutionCode           
collectionID                   f25dc2ec-d71d-4014-abb2-c3b64af25d79
basisOfRecord                  PreservedSpecimen
occurrenceID                   c0fa5188-9440-4727-8cf2-3d963b7de039
catalogNumber                  ISC-F-0124477
otherCatalogNumbers            ISC0372286
higherClassification           Fungi|Basidiomycota|Agaricomycotina|Dacrymycetes|Dacrymycetales|Dacrymycetaceae|Dacrymyces
kingdom                        Fungi
phylum                         Basidiomycota
class                          Dacrymycetes
order                          Dacrymycetales
family                         Dacrymycetaceae
scientificName                 Dacrymyces rubiformis
taxonID                        163733
scientificNameAuthorship       (Fr.) Neuhoff
genus                          Dacrymyces
subgenus                       
specificEpithet                rubiformis
verbatimTaxonRank              
infraspecificEpithet           
taxonRank                      Species
identifiedBy                   L.L. Kennedy
dateIdentified                 1956
identificationReferences       
identificationRemarks          
taxonRemarks                   
identificationQualifier        
typeStatus                     
recordedBy                     S. Lundell
recordNumber                   s.n.
eventDate                      1935-11-19
year                           1935
month                          11
day                            19
startDayOfYear                 323
endDayOfYear                   
verbatimEventDate              19.XI.1935
occurrenceRemarks              There are no mature basisia or spores but I think this is Dacrymyces palmatus (Schw.) Bres.; young state
habitat                        cut twigs and branches of host
fieldNumber                    
informationWithheld            
dataGeneralizations            
dynamicProperties              
associatedOccurrences          
associatedTaxa                 host: Pinus
reproductiveCondition          
establishmentMeans             
lifeStage                      
sex                            
individualCount                
preparations                   
country                        Sweden
stateProvince                  Uppland
county                         Lena
municipality                   Uppsala l
locality                       Upland: Lena parish, �-"'Üngeby skog�-�; 60 17
locationRemarks                
decimalLatitude                60
decimalLongitude               17
geodeticDatum                  
coordinateUncertaintyInMeters  
verbatimCoordinates            
georeferencedBy                
georeferenceProtocol           
georeferenceSources            
georeferenceVerificationStatus 
georeferenceRemarks            
minimumElevationInMeters       
maximumElevationInMeters       
minimumDepthInMeters           
maximumDepthInMeters           
verbatimDepth                  
verbatimElevation              
disposition                    
language                       
recordEnteredBy                Angela Yoon
modified                       2021-04-13 02:19:54
rights                         http://creativecommons.org/publicdomain/zero/1.0/
rightsHolder                   Iowa State University
accessRights                   
recordId                       urn:uuid:c0fa5188-9440-4727-8cf2-3d963b7de039
references                     https://mycoportal.org/portal/collections/individual/index.php?occid=6044645
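
Since the � also appears when reading the line straight from the archive, it may help to check whether the replacement characters are already stored in the bytes (U+FFFD encodes to EF BF BD in UTF-8) or whether the bytes are simply not valid UTF-8. A minimal sketch, assuming the raw line has been saved as bytes; the function names and the sample bytes are illustrative and not taken from the archive:

```python
REPLACEMENT_UTF8 = b"\xef\xbf\xbd"  # U+FFFD encoded as UTF-8


def invalid_utf8_offsets(raw):
    """Return byte offsets at which strict UTF-8 decoding fails."""
    offsets = []
    pos = 0
    while pos < len(raw):
        try:
            raw[pos:].decode("utf-8")
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as err:
            offsets.append(pos + err.start)
            pos += err.start + 1  # skip the offending byte and keep scanning
    return offsets


def has_stored_replacement(raw):
    """True if U+FFFD is itself stored in the bytes (replaced upstream)."""
    return REPLACEMENT_UTF8 in raw


# Made-up samples: a Latin-1 encoded U-umlaut (0xDC) is not valid UTF-8 here ...
latin1_sample = "\u00dcngeby".encode("latin-1")
print(invalid_utf8_offsets(latin1_sample))

# ... whereas a stored U+FFFD is valid UTF-8 and passes strict decoding.
stored_sample = "\ufffd-".encode("utf-8")
print(invalid_utf8_offsets(stored_sample), has_stored_replacement(stored_sample))
```

If the archive bytes contain EF BF BD, the information was lost before the archive was published and cannot be recovered downstream. If instead the bytes fail strict decoding, a different source encoding is the more likely explanation.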

@jhpoelen
Member Author

jhpoelen commented Jan 9, 2025

tracing the origin of the DwC-A leads to

<hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184> <http://www.w3.org/ns/prov#wasGeneratedBy> <urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> <urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> .
<hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184> <http://www.w3.org/ns/prov#qualifiedGeneration> <urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> <urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> .
<urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> <http://www.w3.org/ns/prov#generatedAtTime> "2024-04-01T21:54:35.486Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> <urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> .
<urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Generation> <urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> .
<urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> <http://www.w3.org/ns/prov#wasInformedBy> <urn:uuid:03424633-1886-4376-998c-3fffb0147c82> <urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> .
<urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> <http://www.w3.org/ns/prov#used> <https://www.mycoportal.org/portal/content/dwca/ISC_DwC-A.zip> <urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> .
<https://www.mycoportal.org/portal/content/dwca/ISC_DwC-A.zip> <http://purl.org/pav/hasVersion> <hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184> <urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158> .

@jhpoelen
Member Author

jhpoelen commented Jan 9, 2025

and a recent trace of the source shows that the archive has not changed: the source URL still resolves to the same content id (https://www.mycoportal.org/portal/content/dwca/ISC_DwC-A.zip http://purl.org/pav/hasVersion hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184 urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e).

<https://preston.guoda.bio> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#SoftwareAgent> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<https://preston.guoda.bio> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Agent> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<https://preston.guoda.bio> <http://purl.org/dc/terms/description> "Preston is a software program that finds, archives and provides access to biodiversity datasets."@en <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Activity> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> <http://purl.org/dc/terms/description> "A crawl event that discovers biodiversity archives."@en <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> <http://www.w3.org/ns/prov#startedAtTime> "2025-01-09T21:21:12.633Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> <http://www.w3.org/ns/prov#wasStartedBy> <https://preston.guoda.bio> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<https://doi.org/10.5281/zenodo.1410543> <http://www.w3.org/ns/prov#usedBy> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<https://doi.org/10.5281/zenodo.1410543> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Software> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<https://doi.org/10.5281/zenodo.1410543> <http://purl.org/dc/terms/bibliographicCitation> "Jorrit Poelen, Icaro Alzuru, & Michael Elliott. 2018-2024. Preston: a biodiversity dataset tracker (Version 0.10.3-SNAPSHOT) [Software]. Zenodo. https://doi.org/10.5281/zenodo.1410543"@en <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<urn:uuid:0659a54f-b713-4f86-a917-5be166a14110> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Entity> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<urn:uuid:0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/dc/terms/description> "A biodiversity dataset graph archive."@en <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<hash://sha256/ee98f4744ef123e582257679f3b8cadfc118c9024547a22a8c69c05cc58e5de9> <http://www.w3.org/ns/prov#usedBy> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> .
<hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184> <http://www.w3.org/ns/prov#wasGeneratedBy> <urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> <urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> .
<hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184> <http://www.w3.org/ns/prov#qualifiedGeneration> <urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> <urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> .
<urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> <http://www.w3.org/ns/prov#generatedAtTime> "2025-01-09T21:21:15.126Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> <urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> .
<urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Generation> <urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> .
<urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> <http://www.w3.org/ns/prov#wasInformedBy> <urn:uuid:3580252b-528e-444d-b7a8-8ec81e2b0be7> <urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> .
<urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> <http://www.w3.org/ns/prov#used> <https://www.mycoportal.org/portal/content/dwca/ISC_DwC-A.zip> <urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> .
<https://www.mycoportal.org/portal/content/dwca/ISC_DwC-A.zip> <http://purl.org/pav/hasVersion> <hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184> <urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e> .
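
The comparison of the two traces can be scripted. This is a rough sketch, assuming the provenance is available as N-Quads text: the regular expression only handles statements whose four terms are all IRIs (enough for the `hasVersion` lines) and is not a general N-Quads parser. The helper names are hypothetical.

```python
import re

HAS_VERSION = "http://purl.org/pav/hasVersion"
# Matches statements whose four terms are all IRIs; literals are ignored.
QUAD = re.compile(r"^<([^>]+)> <([^>]+)> <([^>]+)> <([^>]+)> \.$")


def content_ids(nquads):
    """Map each source URL to the set of content ids it resolved to."""
    found = {}
    for line in nquads.splitlines():
        match = QUAD.match(line.strip())
        if match and match.group(2) == HAS_VERSION:
            found.setdefault(match.group(1), set()).add(match.group(3))
    return found


SOURCE = "https://www.mycoportal.org/portal/content/dwca/ISC_DwC-A.zip"
CONTENT_ID = "hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184"


def has_version(graph):
    """Rebuild the hasVersion statement of a trace for the given graph name."""
    return f"<{SOURCE}> <{HAS_VERSION}> <{CONTENT_ID}> <{graph}> ."


# The hasVersion statements of the 2024 and 2025 traces shown in this thread.
trace_2024 = has_version("urn:uuid:aab36207-7680-4ae8-b10a-b290cd179158")
trace_2025 = has_version("urn:uuid:9280300c-18e8-414d-9fd9-e11e82ef429e")

unchanged = content_ids(trace_2024) == content_ids(trace_2025)
print(unchanged)
```

Both traces map the source URL to the same content id, so the suspicious characters were already present in the archive as tracked in April 2024.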

jhpoelen changed the title from "stream processor appears to stop in record 1b71bf22-066c-49a5-8556-531d9e0e0776 https://mycoportal.org/portal/collections/individual/index.php?occid=6044643 as seen in DwC-A with content id hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184" to "suspicious characters found in locality description of record 1b71bf22-066c-49a5-8556-531d9e0e0776 https://mycoportal.org/portal/collections/individual/index.php?occid=6044643 as seen in DwC-A with content id hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184" on Jan 9, 2025
jhpoelen changed the title from "suspicious characters found in locality description of record 1b71bf22-066c-49a5-8556-531d9e0e0776 https://mycoportal.org/portal/collections/individual/index.php?occid=6044643 as seen in DwC-A with content id hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184" to "suspicious characters found in locality description of record c0fa5188-9440-4727-8cf2-3d963b7de039 https://mycoportal.org/portal/collections/individual/index.php?occid=6044643 as seen in DwC-A with content id hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184" on Jan 9, 2025
jhpoelen changed the title from "suspicious characters found in locality description of record c0fa5188-9440-4727-8cf2-3d963b7de039 https://mycoportal.org/portal/collections/individual/index.php?occid=6044643 as seen in DwC-A with content id hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184" to "suspicious characters found in locality description of record c0fa5188-9440-4727-8cf2-3d963b7de039 https://mycoportal.org/portal/collections/individual/index.php?occid=6044645 as seen in DwC-A with content id hash://sha256/64e36ad6feb7eb14a70b106385def3de5b18c773fe8c96252b50f5859374f184" on Jan 9, 2025
@jhpoelen
Member Author

jhpoelen commented Jan 9, 2025

as for the terminal appearing to "break" on control characters in streamed data, see

https://unix.stackexchange.com/questions/247999/how-to-prevent-random-console-output-from-breaking-the-terminal

which suggests that a systematic approach to avoiding the issue is not trivial.

Note that this should not affect output written to a file, i.e., when the stream is redirected to a file instead of scrolling through the terminal.
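
One partial workaround is to make control characters visible only when the output goes to a terminal, and to pass the data through unchanged otherwise. This is a minimal sketch of that idea, not something elton or preston does; `make_visible` and `for_display` are hypothetical names, and it does not cover every way a terminal can be confused.

```python
import unicodedata


def make_visible(text):
    """Replace control characters (except newline and tab) with \\xNN escapes."""
    out = []
    for ch in text:
        if ch in "\n\t" or unicodedata.category(ch) != "Cc":
            out.append(ch)
        else:
            out.append(f"\\x{ord(ch):02x}")
    return "".join(out)


def for_display(lines, interactive):
    """Escape control characters only when the output is shown interactively."""
    for line in lines:
        yield make_visible(line) if interactive else line


# Possible use as a filter at the end of a pipeline (left as a comment so the
# sketch does not block on standard input):
#   import sys
#   sys.stdout.writelines(for_display(sys.stdin, sys.stdout.isatty()))

print(make_visible("before\x1b[31mafter\n"), end="")
```

Because the check relies on `isatty()`, redirecting to a file keeps the data byte-for-byte as streamed, which matches the note above that file output is not affected.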
