This repo contains the frontend for Korp, Språkbanken's word research platform using the IMS Open Corpus Workbench (CWB). Korp is a great tool for searching and visualising natural language corpus data.
Korp is mainly developed by Språkbanken at the University of Gothenburg, Sweden. Other organizations that use the software also contribute.
Documentation:
- Frontend documentation
- Backend documentation
- Sparv - The pipeline used to tag and otherwise process raw Swedish-language corpus data is documented here
- Språkbanken's Korp configuration directory (supplement to documentation)
Install yarn: https://yarnpkg.com
- install all dependencies: yarn
- run development server: yarn start
- build a dist-version: yarn build
Declare dependencies using yarn add pkg, or yarn add --dev pkg for dev dependencies.
Using npm instead of yarn has not worked in the past; whether it works now is unknown.
We use webpack to build Korp and webpack-dev-server to run a local server. To include new code or resources, require or import them where needed:
import { aFunction } from 'new-dependency'
or
const nd = require("new-dependency")
nd.aFunction()
or
// webpack resolves the image and returns its path in the build output
const imgPath = require("img/image.png")
const myTemplate = `<img src='${imgPath}'>`
Some dependencies are only specified in app/index.ts.
About the current loaders in webpack.config.js:
- pug and html files: all src attributes in <img> tags and all hrefs in <link> tags will be loaded by webpack and replaced in the markup. Uses file loader, so requiring a pug or html file gives back the path to the file.
- js files are added to the bundle.
- all images and fonts are added to the bundle using file loader, which gives back a file path.
- css and scss are added to the bundle. urls will be loaded and replaced by webpack.
In addition to this, some specific files will simply be copied as is, for example Korp mode-files.
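The loader behaviour described above can be illustrated with a simplified set of webpack rules. This is only a sketch: the loader names (file-loader, style-loader, css-loader, sass-loader) are real webpack loaders, but which loaders and patterns Korp's webpack.config.js actually uses is an assumption here, and the extra processing of src and href attributes in pug and html files is left out. The helper loadersFor is hypothetical and only mimics how webpack matches a file against each rule's test pattern.

```javascript
// Illustrative rules only; loader choices and patterns are assumptions,
// not the contents of Korp's webpack.config.js.
const rules = [
  // Requiring a pug or html file gives back a file path.
  { test: /\.(pug|html)$/, use: ["file-loader"] },
  // Images and fonts are emitted as files; requiring them gives back a path.
  { test: /\.(png|jpe?g|gif|svg|woff2?|ttf|eot)$/, use: ["file-loader"] },
  // Stylesheets are added to the bundle.
  { test: /\.css$/, use: ["style-loader", "css-loader"] },
  { test: /\.scss$/, use: ["style-loader", "css-loader", "sass-loader"] },
];

// Hypothetical helper: collects the loaders of every rule whose pattern matches.
function loadersFor(filename) {
  return rules
    .filter((rule) => rule.test.test(filename))
    .flatMap((rule) => rule.use);
}

console.log(loadersFor("img/image.png"));
console.log(loadersFor("styles/main.scss"));
```

Plain js files match none of these rules because webpack bundles them without an extra loader, so loadersFor returns an empty list for them.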
Use config.yml for settings needed in the frontend. In some cases, mode-files can be used. For example, it is possible to have different backends for different modes.
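One way to picture per-mode settings is as an override layered on top of the general settings. The sketch below assumes that values from a mode-file take precedence over values from config.yml; that precedence, the function settingsForMode, and the setting names backend_url and default_language are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper: mode settings override the general settings.
// The real merging logic and key names in Korp may differ.
function settingsForMode(baseSettings, modeSettings) {
  return { ...baseSettings, ...modeSettings };
}

// Stand-ins for values that would come from config.yml and a mode-file.
const baseSettings = {
  backend_url: "https://example.org/korp-backend",
  default_language: "en",
};
const modeSettings = {
  backend_url: "https://example.org/other-backend",
};

console.log(settingsForMode(baseSettings, modeSettings).backend_url);
```

With these example values, the mode uses its own backend while keeping the general default_language.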
There are several instances of Korp. Here is a list of some:
- Språkbanken Text
- The Language Bank of Finland (Kielipankki)
- Iceland / Stofnun Árna Magnússonar í íslenskum fræðum
- Tromsø / Giellatekno
- Copenhagen / Institut for Nordiske Studier og Sprogvidenskab
When developing, the frontend is served at http://localhost:9111 by default.
Host and port can be changed by the environment variables:
KORP_HOST=<host>
KORP_PORT=<port>
Environment variables can be entered in the .env file, which is git-ignored.
It is also possible to serve the frontend over HTTPS using the environment variables:
KORP_HTTPS=true
KORP_KEY=<path_to_key>-key.pem
KORP_CERT=<path to cert>.pem
The key and cert can be created using mkcert.
mkcert korp.spraakbanken.gu.se
mkcert -install
Now use korp.spraakbanken.gu.se as the value for KORP_HOST. The hostname must also be added to /etc/hosts.
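To make the effect of these variables concrete, here is a minimal sketch of how they could be turned into development server settings. The function devServerSettings and the shape of the object it returns are hypothetical and not taken from Korp's webpack.config.js; only the variable names and the default address http://localhost:9111 come from the text above.

```javascript
// Hypothetical helper, not Korp's actual configuration code.
function devServerSettings(env) {
  const host = env.KORP_HOST || "localhost";
  // Fall back to the documented default when KORP_PORT is unset or not a number.
  const port = Number(env.KORP_PORT) || 9111;
  const https = env.KORP_HTTPS === "true";
  const settings = {
    host,
    port,
    url: `${https ? "https" : "http"}://${host}:${port}`,
  };
  if (https) {
    // Only the paths are passed on; reading the files is left out of this sketch.
    settings.key = env.KORP_KEY;
    settings.cert = env.KORP_CERT;
  }
  return settings;
}

console.log(devServerSettings({}).url); // prints "http://localhost:9111"
console.log(devServerSettings({ KORP_PORT: "9000" }).url); // prints "http://localhost:9000"
```

In a real setup the argument would typically be process.env, after the .env file has been loaded.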
Development is done on the dev branch. Changes there are not necessarily stable or well tested yet.
Once tested, they can be merged to the master branch in a release.
When doing a release:
- Update version in package.json to the next version
- Add relevant changes to CHANGELOG.md
- Check that the user manual and development documentation are up to date
- Merge dev to master (using --no-ff)
- Tag the merge commit with the new version (prefixed with v, see the other tag names)
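The merge and tag steps above can be sketched as a small helper that lists the git commands for a given version. The function releaseCommands is hypothetical and not part of the repository, and it assumes a plain (non-annotated) tag; check the existing tags before relying on that. Editing package.json and CHANGELOG.md, and pushing the result, are left out.

```javascript
// Hypothetical helper that lists the git steps for a release.
function releaseCommands(version) {
  // Tags are the version number prefixed with "v".
  const tag = `v${version}`;
  return [
    "git checkout master",
    // --no-ff always creates a merge commit, which is then tagged.
    "git merge --no-ff dev",
    `git tag ${tag}`,
  ];
}

console.log(releaseCommands("1.2.3").join("\n"));
```

The version 1.2.3 is only an example value; use the version that was written to package.json.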
As an external developer, when forking this repository, you may choose to pull from dev and/or master, depending on whether you need the latest or the most stable changes.