I am happy to announce that, nine days after stage one was completed, all of the files in stage two have now been imported into the site. Here is a little more about what is included in each stage. Stage one included all the files whose SVG contained metadata that I was able to extract. That usually included things like a title, date, description, tags and a creator. Not every stage one file had every metadata field, but they often had more than one. When a stage one file had no title, one was created from the filename where possible.
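As a rough illustration of the stage one idea, here is a minimal Python sketch. It is not the actual import script. It assumes the metadata sits in an RDF block with Dublin Core fields, as Inkscape and many Openclipart files store it, and the function names (`extract_metadata`, `title_from_filename`) are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Namespaces commonly found in Inkscape/Openclipart SVG metadata (assumed here).
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def title_from_filename(filename):
    """Hypothetical fallback: turn 'red_apple.svg' into 'Red Apple'."""
    stem = filename.rsplit("/", 1)[-1]
    stem = re.sub(r"\.svg$", "", stem, flags=re.IGNORECASE)
    words = re.sub(r"[_\-]+", " ", stem).strip()
    return words.title() if words else None


def text_of(element):
    """Return the whitespace-normalized text of an element, or None if empty."""
    if element is None:
        return None
    text = " ".join("".join(element.itertext()).split())
    return text or None


def extract_metadata(svg_source, filename):
    """Pull title, date, description, creator and tags out of an SVG string."""
    meta = {"title": None, "date": None, "description": None,
            "creator": None, "tags": []}
    root = ET.fromstring(svg_source)
    rdf = root.find(".//rdf:RDF", NS)
    # The first child of rdf:RDF is normally the cc:Work element.
    work = rdf[0] if rdf is not None and len(rdf) else None
    if work is not None:
        for field in ("title", "date", "description", "creator"):
            meta[field] = text_of(work.find(f"dc:{field}", NS))
        meta["tags"] = [
            li.text.strip()
            for li in work.findall("dc:subject/rdf:Bag/rdf:li", NS)
            if li.text and li.text.strip()
        ]
    # No title in the metadata: fall back to the filename where possible.
    if meta["title"] is None:
        meta["title"] = title_from_filename(filename)
    return meta
```

Only the title falls back to the filename in this sketch; any other missing field simply stays empty.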
Stage two included all the files that had no metadata in the SVG file. In these cases the filename occasionally contained the creator or a date, and a title could be created from the filename. The result is not exact, since the filename scheme is inconsistent over time. Dates were harder: since there were no dates in the metadata, they are often estimated from the last file that had a date. When no date could be estimated from the last date available by ID, a default date of Jan 1, 1970 should have been assigned. Some creator names were included in the filenames, and when I noticed a name or was able to pick it out, it was removed from the title and added as the creator. I am sure I missed many.
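The date estimation can be pictured with a small Python sketch. This is only an illustration, not the real import code. It assumes IDs run roughly in upload order, so a file without a date borrows the date of the nearest earlier ID that has one. The function name `estimate_dates` and the `estimated` flag are made up for this example.

```python
from datetime import date

# Default used when no earlier file has a date to borrow.
DEFAULT_DATE = date(1970, 1, 1)


def estimate_dates(records):
    """Map each file ID to (date, estimated).

    records is an iterable of (file_id, known_date_or_None) pairs.
    """
    result = {}
    last_known = None
    for file_id, known in sorted(records, key=lambda r: r[0]):
        if known is not None:
            # A real date: keep it and remember it for later files.
            last_known = known
            result[file_id] = (known, False)
        elif last_known is not None:
            # No date: reuse the last date seen at a lower ID.
            result[file_id] = (last_known, True)
        else:
            # Nothing to borrow from: fall back to the default.
            result[file_id] = (DEFAULT_DATE, True)
    return result
```

Keeping a flag next to each date makes it possible to tell estimated dates apart from real ones later, for example when a creator wants to correct a listing.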
Stage three files are all the files from stage one and stage two that have incomplete metadata and little to no clue in the filename about what they are. These files need to be done by hand. I might see if I can cross-reference them with what is available in the Wayback Machine, but these will take longer to process. These will also include files that have already been added but that I simply missed or that are having problems processing as SVGs or images. I will probably leave many of these for a bit, since I think I need a break from looking at spreadsheets of files.
Next on the site will be some cleanup that will make the site easier to maintain. Here are some things on the list for the next little while:
- Ability to submit changes to existing listings from the listing pages themselves when you are logged in.
- Improvements to the quick search bar at the top and to the search results page.
- Getting back to the JSON API to include the ability to POST new clipart to the site via the API and accept edits to existing items.
- Some automated processes to flag when images are not processing correctly.
- Friendlier login, registration and profile pages. This includes nicer edit pages for creators to edit listings.
If any of the original Openclipart creators would like to join, it is possible to have your Openclipart submissions linked to your account here. This allows you to do several things:
- Edit the titles, descriptions, dates and tags of the already included Openclipart items here on the site.
- Eventually export a CSV file of just your submissions.
- Get some statistics of downloads.
- Other suggestions? Leave a comment.
To get started, you need to create an account and drop me a note with your original Openclipart user name, and we will sort out which items are yours. This is still a work in progress and I am still working out some of the edit screens.
I think it is important that the original creators get credit for their work here. While that is not required under CC0, I think the creators behind the artwork should know that their work is appreciated.
I have also updated the CSV exports of the clipart listings. The CSV export tool will be getting a bit of an update soon, since it currently only exports the Openclipart-sourced files. Now that new files can be uploaded here, I need to make sure those other listings get included.