Configuration¶
Configure the Application for Development or Production¶
Finally, create a file dissemin/settings/__init__.py
with this content:
# Development settings
from .dev import *
# Production settings.
from .prod import *
# Pick only one.
For most of the settings we refer to the Django documentation.
Logs¶
Dissemin comes with a predefined log system. You can change the settings in dissemin/settings/common.py
and change the default log level for production and development in the corresponding files. When using Dissemin from the shell with ./manage shell
you can set the log level for console output as environment variable with:
export DISSEMIN_LOGLEVEL='YOUR_LOG_LEVEL'
When using in production make sure that apache collects all your log message. Alternatively you can send them to a separate file by changing log settings.
ORCID¶
You can either use production ORCID or its sandbox. The main difference is the registration process.
You are not forced to configure ORCID to work on Dissemin, just create a super user and use it!
Production¶
On your ORCID account got to Developer Tools and register an API key. As a redirection URL you give the URL to your installation.
Set ORCID_BASE_DOMAIN
to orcid.org
in the Dissemin settings.
On the admin surface got to Social Authentication, set the provider to orcid.org
and enter the required data.
Now you can authenticate with ORCID.
Sandbox¶
Create an account on Sandbox ORCID.
Go to Developer Tools, verify your mail using Mailinator. You must not choose a different provider.
Set up a redirection URI to be localhost:8080 (supposed to be where your Dissemin instance server is running).
Now proceed as in Production, but with sandbox.orcid.org
.
Repositories¶
By default, Dissemin is not configured to let users deposit in any repository. To enable this feature, you need to add repositories by visiting the Django admin interface at /admin/deposit/repository/ and creating a new repository there. Depending on the repository, you will need to supply various data fields.
In all cases you need to provide a name for the repository, a short description which will be shown to users, and a logo.
HAL¶
To set up the connection with HAL, use the following settings:
- Protocol: HAL Protocol (Sword)
- OAI source: HAL
- Username: name of the user account under which the deposits should be made by default. This account can be created at https://hal.archives-ouvertes.fr/ for production and https://hal-preprod.archives-ouvertes.fr/ for testing
- Password: corresponding password for this account
- Endpoint: https://api.archives-ouvertes.fr/sword/ for production, https://api-preprod.archives-ouvertes.fr/sword/ for testing
- API key can be left blank
Zenodo¶
To set up the connection with Zenodo, use the following settings:
- Protocol: Zenodo
- OAI source: Zenodo
- Username and password can be left blank
- Endpoint: https://zenodo.org/api/deposit/depositions for production and https://sandbox.zenodo.org/api/deposit/depositions for testing
- API key: should be obtained from your Zenodo account, created on the corresponding instance (either https://zenodo.org for production, or https://sandbox.zenodo.org for testing)
Shibboleth¶
Shibboleth is a SAML based authentication mechanism and widely used in academic research. CAPSH has joined the French federation RENATER in order to provide a login with eduGAIN. In the SAML world there is usually an IdentityProvider (IdP) that permits (local) authentication and a Service Provider (SP) that offers some kind of service. In this case, https://dissem.in/ will be the SP.
Relevant documentations can be found at Shibboleth and RENATER. They cover some of the understanding of how Shibboleth works as well as instructions on participating and register a SP.
The entityID
for our production service is https://sp.dissem.in/shibboleth
while we use https://sp.sandbox.dissem.in/shibboleth
for our sandbox.
The guide assumes in the following, that production and sandbox run on the same machine.
Installation¶
Shibboleth requires mod_shibboleth
and a daemon.
Official packages are available for RedHat and openSUSE.
For Ubuntu and Debian based system, please follow the guide from SWITCHaai.
The certs and keys for signing and encryption might be missing. They can be self signed certificates. To generate them, run:
openssl req -config config-cert.conf -new -x509 -nodes -days 1095 -keyout sp-encrypt-key.pem -out sp-encrypt-cert.pem
openssl req -config config-cert.conf -new -x509 -nodes -days 1095 -keyout sp-signing-key.pem -out sp-signing-cert.pem
where the config file looks like:
[ req ]
default_bits = 4096
distinguished_name = req_distinguished_name
prompt = no
x509_extensions = req_ext
[ req_distinguished_name ]
C = FR
O = CAPSH
CN = dissem.in
emailAddress = team@dissem.in
[req_ext ]
subjectAltName = @alt_names
[ alt_names ]
DNS.1 = dissem.in
DNS.2 = sandbox.dissem.in
DNS.3 = https://sp.dissem.in/shibboleth # entityID production
DNS.4 = https://sp.sandbox.dissem.in/shibboleth # entityID sandbox
Warning
When the certificates expire and we have to renew them, we must communicate to RENATER! For a short period of time we have to provide both certificates, the old and new ones, so that the IdPs can update to the new one and the transition is seamless.
Note
In theory, we can use the same certificate as for the https server, but this is disadvantageous with Let’s Encrypt since with every new certificate, we would need to change our shibboleth metadata.
shibboleth2.xml¶
This is the central configuration file for Shibboleth where the magic happens. After a change of the configuration, touch the file, to tell the shibboleth deamon to reload. This does not disturb the service. Depending on the changes, the metadata for our entityId change.
Since RENATER offers a production as well a test federation, we need to create different metadata. This will be done via ApplicationOverride as there are little differences only, that must be set explicetely:
- entityID
- discoveryURL
- MetadataProvider
- MetadataGenerator
You can find our (sample) shibboleth2.xml
als well as our attribute-map.xml
in our GibLab repository. Check the folder provisioning
.
Make also sure, that the settings comply with SAML Metadata Published by RENATER.
Apache¶
In order to make Shibboleth available on the virtual host, add:
<Location /Shibboleth.sso>
setHandler shib
</Location>
This way Shibboleth gets precedence over WSGI for /Shibboleth.sso
.
In theory, you could use any other alias, but this is somewhat of a standard.
For our sandbox, make sure to add:
<Location />
ShibRequestSetting applicationId sandbox
AuthType shibboleth
Require shibboleth
</Location>
right before the WSGI-part. This makes sure to use the ApplicationOverride for sandbox that we mentioned above.
Django¶
In Django, only a few things need to be configured.
You need to set SHIB_DS_SP_URL
which is the URL that leads to the Daemons Login site, which will perform a redirect to the choosen IdP. This is for production https://dissem.in/Shibboleth.sso/Login
. Then you will have to point to the DiscoFeed. You can do this either by pointing to a URL or file, usually the URL is fine and is for production https://dissem.in/Shibboleth.sso/DiscoFeed
.
Note
In development settings, both are predefined and there’s no necessarity to change them. However, an authentication won’t be possible, because the value are somewhat made up.
Then you need to set SHIBBOLETH_LOGOUT_URL
in the settings. This points to the Daemons logout site, e.g. https://dissem.in/Shibboleth.sso/Logout
and logs out the user from the Daemon. The user is eventually redirected to Dissemins start page.
On development environment you shouldn’t set this value.
Troubleshooting¶
Systemd timeout¶
Under certain circumstances shibd
does take along time to start.
This is due to the fact that we process the whole eduGAIN IdP metadata.
The crucial time killer is the validation of signatures.
Usually this is only an issue when starting shibd for the first time, since cached IdPs won’t be validated again.
There are three ways to solve this:
- Increase timeout on systemd for shibd
- Stop shibd and initialize it manually
- Turn off validation.
Of course, 3. is not an option!
The standard approach to solve this is usually to use MDQ, where IdPs will be checked in case of need. This system is not (yet) suitable for a DiscoveryService since it needs to know all IdPs.
Missing attributes¶
Albeit we postulated our demanded attributes within eduGAIN, this does mean, that an IdP will release the requested attributes. It is up to the IdP which attributes it releases to a SP. Ususally they will ship eduPersonTargetedId, surname and givenName. In case of need we can ask the IdP to release more attributes.