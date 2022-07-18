KDE and GNOME Development Work
-
Preprocessing for improving the quality of the output
Based on the results of previous research, I realized that Tesseract does various processing internally before doing the actual OCR. However, there are still cases where Tesseract shows a significant reduction in accuracy.
So in this post, I would like to introduce some pre-processing methods that can be applied to an image directly before passing it to Tesseract.
-
GSOC Update 1
GSOC is in full swing and here is my first progress update! I’ve been spending time getting familiar with the Krita code base. The first step in my project was making SVG appear as an export option so I could start testing the export code. While this may seem straightforward (I certainly thought it would be), there are a few things that we’ll need to do.
First, how does Krita know what files it can import/export as? Well, that is easy enough to answer: in a database. Specifically, Krita has a class KisMimeDatabase that stores all the file formats Krita supports. Adding a new option to this database is fairly easy, as there are plenty of examples in KisMimeDatabase.cpp. We can mostly copy/paste how other options are added but replace the file name with svg. Neat.
-
Status update 18/07/2022 – Sam Thursfield
Summer is here!
All my creative energy has gone into wrapping up a difficult project at Codethink, and the rest of the time I’ve been enjoying sunshine and festivals. I was able to dedicate some time to learning the basics of async Rust but I don’t have much to share from the last month. Instead, let me focus on some projects I’m keeping an eye on.
Firstly, in the Tracker search engine, Carlos Garnacho has landed some important features and refactors. The main one is stream-based serializers and deserializers.
This makes it easier to back up and import data into and out of the tracker-store, and to clean up some cruft like multiple different implementations of Turtle. It seems ideal to have a totally stream-based codec so you can process an effectively infinite amount of data, but there is a tradeoff if you serialize data triple-by-triple – the serialized output is much less human-readable and in some cases larger than if you do some buffering and group related statements together. For this reason we didn’t yet land the JSON-LD support.
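To make the tradeoff concrete, here is a minimal Python sketch. It is not Tracker's code: the sample triples, the `ex:` terms, and both function names are hypothetical and exist only for illustration. The first serializer emits each triple as its own statement, so it needs no buffering. The second buffers consecutive triples that share a subject and writes them as one Turtle-style block.

```python
from itertools import groupby

# Hypothetical sample data: (subject, predicate, object) terms, already
# written in Turtle syntax. The ex: names are made up for this sketch.
TRIPLES = [
    ("<urn:song:1>", "a", "ex:MusicPiece"),
    ("<urn:song:1>", "ex:title", '"Summer"'),
    ("<urn:song:1>", "ex:trackNumber", "1"),
    ("<urn:song:2>", "a", "ex:MusicPiece"),
    ("<urn:song:2>", "ex:title", '"Festival"'),
]


def serialize_streaming(triples):
    """Emit one complete statement per triple; nothing is buffered."""
    for s, p, o in triples:
        yield f"{s} {p} {o} .\n"


def serialize_grouped(triples):
    """Buffer consecutive triples with the same subject into one block."""
    for subject, group in groupby(triples, key=lambda t: t[0]):
        # Only the statements of the current subject are held in memory.
        pairs = [f"{p} {o}" for _, p, o in group]
        yield f"{subject} " + " ;\n    ".join(pairs) + " .\n"


streamed = "".join(serialize_streaming(TRIPLES))
grouped = "".join(serialize_grouped(TRIPLES))

if __name__ == "__main__":
    print(streamed)
    print(grouped)
    print(len(streamed), "characters streamed vs", len(grouped), "grouped")
```

In this sketch the streaming output repeats the subject on every line, while the grouped output states it once per block, which is both shorter and easier to scan. The cost is that the grouped version has to hold a subject's statements in memory before it can write anything.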
-
Pitivi GSoC: 2nd Update
This is my second GSoC update blog, covering the 2 weeks since my last update, and we have made further progress on the port.
[...]
In my last update, I said that we would move on to changing the version to GTK4 once the event controllers were implemented, but it didn't go well: not all event controllers were backported, and one of the most used ones had a different name in GTK3+. Thus, for the time being, we have delayed the event controller implementations and moved on to changing the dependencies to GTK4, officially starting the Breaking Phase.
-
