Preprint hosting requirements and priorities discussion
We did this before, when we selected COS. We probably need some new priorities around long-term financial stability, etc. (lesson learned).
Bryan Lougheed Fri 31 Jan 2020 5:27PM
What is the approx expected bandwidth per day?
Bruce Caron Fri 31 Jan 2020 5:54PM
Here's another issue. We hope to build the volume of new preprints and of researchers downloading preprints over time. Let's say we start with 1000 new preprints a year, and 10 times that number of people grabbing these (or more?). In five years, we will want to have 3-4 times that number of new preprints, and 10-20 times the number of downloads. (I could be conservative on these numbers).
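As a rough illustration of the numbers above, here is a minimal sketch of the projection. The average PDF size of 2 MB is purely an assumption for illustration (no file size was given in this thread), and `daily_bandwidth_mb` is a hypothetical helper name.

```python
# Back-of-the-envelope projection of the figures quoted above.
# ASSUMPTION: avg_pdf_mb = 2.0 is an illustrative guess, not a measured value.

def daily_bandwidth_mb(downloads_per_year, avg_pdf_mb=2.0):
    """Average download traffic per day, in MB, for a yearly download count."""
    return downloads_per_year * avg_pdf_mb / 365

start_preprints = 1000                    # new preprints per year at the start
start_downloads = 10 * start_preprints    # "10 times that number" of downloads

# Five-year targets: 3-4x the new preprints, 10-20x the downloads.
preprints_5y = (3 * start_preprints, 4 * start_preprints)
downloads_5y = (10 * start_downloads, 20 * start_downloads)

if __name__ == "__main__":
    print("Year 1 daily MB:", round(daily_bandwidth_mb(start_downloads), 1))
    print("Year 5 daily MB (high):", round(daily_bandwidth_mb(downloads_5y[1]), 1))
```

Even at the high end of the five-year range, the daily volume under this assumption stays small, although real traffic would be burstier than a yearly average suggests.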
Dasapta Erwin Irawan Wed 29 Jan 2020 12:28PM
Hi all. Thank you for inviting me into this conversation. Many important points. We have similar considerations with INArxiv. We would go with a local Indonesian server hosted by the National Research Institute (LIPI), rin.lipi.go.id. It's still a long way to go, but we have had some global discussions, and they are interested in contributing.
Dasapta Erwin Irawan Wed 29 Jan 2020 12:30PM
Did you notice that AfricaXiv had an agreement with ScienceOpen? I do not have the details yet, but next week I will have a chat with Jo Haveman, the founder of AfricaXiv.
Bruce Caron Wed 29 Jan 2020 4:02PM
One reason to not choose a common preprint server (like Zenodo or Figshare) may be to enhance the number of services and build a more diverse ecosystem of providers. This article from a while back is a good start for this idea <https://thewinnower.com/papers/4172-a-healthy-research-ecosystem-diversity-by-design>.
Bruce Caron Fri 6 Mar 2020 3:00PM
From the AC: "maybe we should request a way to put submissions we are discussing like this on hold/flagged."
Daniel Ibarra Fri 6 Mar 2020 5:25PM
Just to chime in with my perspective (as one of the AC moderators): the problem we have been having is that, right now, if a preprint doesn't fulfill our moderation policy (which is posted on our GitHub, but which not all authors seem to read), we have to 'reject' the preprint and send the authors a link to the moderation policy; the authors then re-upload a new version, fix the metadata, etc. Whatever system we choose, it would be helpful to have a sort of hold/flagged category for submissions the AC moderators are looking at, so that we can work more quickly with the authors to fix the submission and thus reduce the time from submission to the preprint being fully posted on EarthArXiv.
Victor Venema Fri 6 Mar 2020 9:36PM
Is my guess right that what goes wrong most often are requirements people do not expect?
Such as printing on the first page that the manuscript is not peer reviewed (it is a pre-print server after all). Or printing the name of the journal one is submitting to (this does not have to be decided yet, one may try multiple ones, and why should we force people to submit to journals?).
If that is what creates the work, that would be an additional reason to reconsider these rules.
Daniel Pastor Galán Mon 9 Mar 2020 2:33AM
It is a pre-print but also a post-print server. The disclaimer is necessary: it is good to know whether you are reading a non-submitted pre-print, a pre-print already submitted and under review, or a pre-print that has been accepted and already includes corrections and modifications from peer review... Maybe we should make these rules more visible, for example on the submission page, or literally ask authors to fill in a simple form that generates that page. But I think it is absolutely necessary.
Daniel Nüst Mon 9 Mar 2020 7:34AM
I was asked by an EarthArXiv moderator to add that information to the PDF and had not considered that before, but it's a very valid concern AND extremely useful. As for requirements, I think the authors should be helped by the preprint platform, and at the same time there is a chance for branding and integrating useful information.
So, when moving to the next platform, I think we should ask for either a) an overlay or watermark (like arXiv) or b) automatic generation of a first page from the form metadata.
When the PDF is processed anyway, proper metadata can also be added. This information could then ideally include the final DOI, so there is a good connection between a PDF someone downloads today and the official record in 5 years' time.
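To make option b) a bit more concrete, here is a minimal sketch of how the text for a generated first page could be built from the submission form metadata. Everything in it is hypothetical: the `Submission` fields, the status names, the wording of the disclaimers, and the `cover_text` helper are assumptions for illustration, not features of any existing platform. Rendering the text onto the PDF itself is left out.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Submission:
    """Hypothetical form metadata; the field names are assumptions."""
    title: str
    status: str                    # "preprint", "under_review", or "postprint"
    doi: Optional[str] = None      # final DOI, once assigned
    journal: Optional[str] = None  # optional; authors need not name a journal

# Illustrative disclaimer wording, one line per status.
STATUS_TEXT = {
    "preprint": "This manuscript is a non-peer-reviewed preprint.",
    "under_review": "This manuscript is a preprint that has been submitted for peer review.",
    "postprint": "This manuscript is a peer-reviewed postprint.",
}

def cover_text(sub: Submission) -> str:
    """Build the first-page text from metadata, so authors never edit the PDF."""
    lines = [sub.title, STATUS_TEXT[sub.status]]
    if sub.journal:
        lines.append(f"Journal: {sub.journal}")
    if sub.doi:
        lines.append(f"DOI: https://doi.org/{sub.doi}")
    return "\n".join(lines)

if __name__ == "__main__":
    # Placeholder title and DOI, for illustration only.
    print(cover_text(Submission("Example title", "preprint", doi="10.0000/example")))
```

Because the status comes from the form, moderators would not need to reject a submission just because the disclaimer is missing from the uploaded PDF.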
Christopher Jackson Mon 9 Mar 2020 2:35PM
I so agree with this Daniel! Automated addition of preprint/postprint status is one of the key things to get. I spend a decent amount of time rejecting submissions for this reason. Also: automated/quicker checking of journal conditions around preprinting/self-archiving of post-prints.
Victor Venema Mon 9 Mar 2020 3:21PM
Would it be enough to add this information as metadata (not forcing people to print it on the first page of the PDF)?
I once had permission from a colleague to put our manuscript on EarthArXiv, but all I had was the PDF and he did not want to go to the trouble of generating a new PDF. Then he started saying "we publish open access anyway", "it is confusing to have two versions", anything not to do the additional work. The easier we make it for people, the more likely they are to move to pre-printing.
(And I maintain that we should not force people to submit to journals.)
Daniel Nüst Mon 9 Mar 2020 3:59PM
I would not ask users to do that at all. I would just add it from the system; see e.g. the left-hand margin of articles on arXiv.
Hope this clarifies.
Leonardo Uieda Mon 9 Mar 2020 6:39PM
I like the automatic watermark that adds our DOI and the disclaimer. For now, it might be worth putting a link to the instructions somewhere very very visible (impossible to ignore). I have the feeling that a lot of people just don't find them when submitting.
Daniel Ibarra Mon 9 Mar 2020 7:43PM
Yes, we've asked for this via @Tom Narock I believe, but COS never followed through. I like this watermark DOI addition if that's something that a future preprint hosting service could add.
Christopher Jackson Mon 9 Mar 2020 7:59PM
We did indeed ask, and didn’t get anywhere...:-(
Bruce Caron Mon 9 Mar 2020 8:16PM
One (hopefully) great thing about moving to a dedicated preprint platform like Open Preprint System is that feature requests might get more attention. Adding new preprint features to the OSF jams these into a larger queue with other priorities.
Bruce Caron Tue 28 Jan 2020 5:15PM
It could be better if the host does not rely on commercial cloud services for the platform, but has already made a commitment to running their own servers. This would mean that the preprints are just a minor load on existing capacity. Again, preprints are tiny, compared with data.