how to shrink the database
Thread poster: jmutka
jmutka
jmutka
United States
Local time: 15:08
Finnish to English
Mar 4, 2015

i have an on-going literary translation project where the database is growing offensively large (too big to email), basically because multiple people are collaborating, and to get their work into the TM requires repeated, sometimes redundant imports.

the same TU may have 3 or 4 translations, and on a database maintenance pass the redundant ones get deleted, but i have found no way to shrink the database accordingly.

is this possible? how?

the database community seems to have mixed opinions about SQL-shrinking...
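A toy illustration of the behaviour described above, assuming the TM is stored as an SQLite file (which later posts in this thread establish for SDLTM). The table `tu` and the file name are hypothetical stand-ins, not the real SDLTM schema. In SQLite's default configuration, deleted rows leave free pages inside the file, so the size typically stays the same until the file is rebuilt with VACUUM:

```python
import os
import sqlite3
import tempfile


def demo_delete_then_vacuum(rows: int = 20000) -> tuple[int, int, int]:
    """Return file sizes in bytes: after insert, after delete, after VACUUM."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy_tm.db")  # hypothetical stand-in for a TM
        con = sqlite3.connect(path)
        con.execute(
            "CREATE TABLE tu (id INTEGER PRIMARY KEY, source TEXT, target TEXT)"
        )
        con.executemany(
            "INSERT INTO tu (source, target) VALUES (?, ?)",
            (
                (f"source segment {i} " * 5, f"target segment {i} " * 5)
                for i in range(rows)
            ),
        )
        con.commit()
        size_full = os.path.getsize(path)

        # Delete three of every four rows, like culling redundant translations.
        con.execute("DELETE FROM tu WHERE id % 4 != 0")
        con.commit()
        size_after_delete = os.path.getsize(path)

        # VACUUM rebuilds the file without the freed space.
        con.execute("VACUUM")
        con.close()
        size_after_vacuum = os.path.getsize(path)
    return size_full, size_after_delete, size_after_vacuum


full, after_delete, after_vacuum = demo_delete_then_vacuum()
print(f"after insert: {full}  after delete: {after_delete}  after VACUUM: {after_vacuum}")
```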


 
jmutka
jmutka
United States
Local time: 15:08
Finnish to English
TOPIC STARTER
bump. Mar 9, 2015

? anyone?

 
FarkasAndras
FarkasAndras  Identity Verified
Local time: 21:08
English to Hungarian
Nope Mar 9, 2015

There has been no response because your question doesn't make much sense. You didn't provide specific numbers so it's not at all clear that anything unusual or "wrong" is going on. If emailing is the bottleneck, that's easily resolved by emailing zipped tmx files instead of emailing sdltm. Zipping sdltm will probably be sufficient too. Use dropbox if all else fails.
If you really want to try and see if you can reduce the size of the sdltm file, export your TM into TMX, create a new TM and import the TM. That way you start with a clean slate.

[Edited at 2015-03-09 11:34 GMT]
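The zipping suggestion above can also be scripted. This is only a sketch using Python's standard `zipfile` module; the function name `zip_tm` is made up for illustration, and it works the same way for a .tmx or .sdltm file:

```python
import zipfile
from pathlib import Path


def zip_tm(tm_path: str) -> Path:
    """Write <file name>.zip next to the TM file and return the archive path."""
    src = Path(tm_path)
    archive = src.with_name(src.name + ".zip")
    # ZIP_DEFLATED compresses the data; the default (ZIP_STORED) would not.
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(src, arcname=src.name)
    return archive
```

How much this saves depends on the content; text-heavy files usually compress well.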


 
jmutka
jmutka
United States
Local time: 15:08
Finnish to English
TOPIC STARTER
shrink shrink shrink Mar 10, 2015

FarkasAndras wrote:
your question doesn't make much sense. You didn't provide specific numbers so it's not at all clear that anything unusual or "wrong" is going on. If emailing is the bottleneck, that's easily resolved by emailing zipped tmx files instead of emailing sdltm. Zipping sdltm will probably be sufficient too. Use dropbox if all else fails.
If you really want to try and see if you can reduce the size of the sdltm file, export your TM into TMX, create a new TM and import the TM. That way you start with a clean slate.

[Edited at 2015-03-09 11:34 GMT]


nothing is going "wrong." i just want to shrink the TM. in database land the size of the data is a perpetual concern - i guess not so once a database has been renamed a TM.

i AM zipping the files for email - SDLTM shrinks quite nicely as well.

but yes, export to TMX / recreation will do what i'm after here. i haven't done it (yet) because i'm not sure how much, if any, metadata would get lost in doing so. what i was looking for is a utility/helper program that would rearrange/reindex with one click - or better yet, a right click function to do so within sdl studio, like outlook has, for example.

cheers.


 
Jan Sundström
Jan Sundström  Identity Verified
Sweden
Local time: 21:08
English to Swedish
use a TM editing/QA tool Mar 30, 2015

Hi!

It would probably make most sense to use a TM editing/QA tool like Xbench:
http://www.xbench.net/index.php/download

It's not a "one-click-solution" but on the other hand you have much more control over the data, i.e. exactly which duplicates you want to cull (oldest, newest, by a certain user, etc.).

/Jan


 
Meta Arkadia
Meta Arkadia
Local time: 02:08
English to Indonesian
Not one click... Mar 30, 2015

jmutka wrote:
what i was looking for is a utility/helper program that would rearrange/reindex with one click

If it's not built-in in Trados, you could open the SDLTM in SQLite, and close it with the SHUTDOWN COMPACT command. Never tried it myself, no guarantees, please use a copy of the SDLTM file.

Cheers,

Hans


 
jmutka
jmutka
United States
Local time: 15:08
Finnish to English
TOPIC STARTER
which app has SHUTDOWN COMPACT? Apr 9, 2015

Meta Arkadia wrote:
you could open the SDLTM in SQLite, and close it with the SHUTDOWN COMPACT command.
Hans

just tried on a copy of one of my TMs. i was using SQLite Expert Personal 3 and its vacuum function, which i assume is the right one to use.

it shrank some 5% in size - but that one did NOT contain a lot of deletions

which exact app are you using that has the SHUTDOWN COMPACT function? but perhaps that is the same as vacuum?

cheers.
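The same experiment can be done without a GUI tool. A minimal sketch with Python's built-in `sqlite3` module, assuming the SDLTM opens as an ordinary SQLite 3 file; the function name `vacuum_copy` and the paths are hypothetical. It works on a copy only and leaves the original untouched:

```python
import os
import shutil
import sqlite3


def vacuum_copy(tm_path: str, out_path: str) -> tuple[int, int]:
    """Copy tm_path to out_path, VACUUM the copy, return (old_size, new_size)."""
    shutil.copyfile(tm_path, out_path)
    con = sqlite3.connect(out_path)
    try:
        # VACUUM rewrites the whole database file without its free pages.
        con.execute("VACUUM")
    finally:
        con.close()
    return os.path.getsize(tm_path), os.path.getsize(out_path)
```

Usage would look like `old, new = vacuum_copy("project.sdltm", "project_vacuumed.sdltm")`, with the TM closed in Studio first. As noted above, the gain is small when the TM has not had many deletions.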


 
jmutka
jmutka
United States
Local time: 15:08
Finnish to English
TOPIC STARTER
xbench - are its filtering functions substantially better than what you can do with trados studio? Apr 9, 2015

Jan Sundström wrote:
It would probably make most sense to use a TM editing/QA tool like Xbench:
http://www.xbench.net/index.php/download

you have much more control over the data, i.e. exactly which duplicates you want to cull (oldest, newest, by a certain user, etc.).

/Jan

actually you're addressing a concern that's different from my original query here. but a valid one nonetheless. xbench seems to be $, so before i buy it you'll have to tell me that it's MUCH better than what i can do with studio 2011 filters. they're quite comprehensive, albeit with a cumbersome UI.

cheers.


 
Walter Blaser
Walter Blaser  Identity Verified
Switzerland
Local time: 21:08
French to German
Another useful tool is Olifant Apr 9, 2015


jmutka wrote:
actually you're addressing a concern that's different from my original query here. but a valid one nonetheless. xbench seems to be $, so before i buy it you'll have to tell me that it's MUCH better than what i can do with studio 2011 filters. they're quite comprehensive, albeit with a cumbersome UI.


If you are looking for a free tool to clean up your TMs (I am talking here about cleaning up in the sense of getting rid of unwanted content, such as duplicates, TUs with identical source and target, etc., and not the physical shrinking you posted about), consider downloading Olifant. It is free, open-source software and works fine. It takes some time to get familiar with its functions, but once you are, it is a wonderful tool, also because it can handle very large files, which most other QA tools cannot.

Walter


 
Meta Arkadia
Meta Arkadia
Local time: 02:08
English to Indonesian
VACUUM it is Apr 10, 2015

jmutka wrote:
which exact app are you using - that has the SHUTDOWN COMPACT -function? but perhaps that is the same as vacuum?

You're absolutely right. SHUTDOWN COMPACT does not work for SQLite3 (am I glad I didn't offer any guarantees...), and VACUUM does. I mixed up my databases once more, I'm afraid. Sorry!
It must be clear from this, I'm not an SQLite expert. I actually started trying to learn it only a few weeks ago. Aspirin donations welcome.

However, I checked a few SDLTM files in an SQLite browser, and it struck me that they don't offer Auto Vacuum.



This could very well be because the set-up makes it less useful, which would explain why your use of VACUUM wasn't very effective either. Don't take my word for it, though, as you undoubtedly already concluded.

Cheers,

Hans
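To see beforehand whether VACUUM is likely to help, the file's pragmas can be read directly. A sketch, assuming the SDLTM is a plain SQLite 3 file; the function name `free_space_report` and the dictionary keys are made up for illustration. `auto_vacuum` returns 0 (none), 1 (full) or 2 (incremental), and `freelist_count` is the number of unused pages currently inside the file:

```python
import sqlite3
from pathlib import Path


def free_space_report(tm_path: str) -> dict:
    """Read page statistics from an SQLite file without modifying it."""
    # mode=ro opens the file read-only, so nothing is written to the TM.
    uri = Path(tm_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        page_size = con.execute("PRAGMA page_size").fetchone()[0]
        page_count = con.execute("PRAGMA page_count").fetchone()[0]
        free_pages = con.execute("PRAGMA freelist_count").fetchone()[0]
        auto_vacuum = con.execute("PRAGMA auto_vacuum").fetchone()[0]
    finally:
        con.close()
    return {
        "auto_vacuum": auto_vacuum,  # 0 = none, 1 = full, 2 = incremental
        "page_size": page_size,
        "file_bytes": page_size * page_count,
        "free_pages": free_pages,
        # Lower bound only: VACUUM also repacks partially filled pages.
        "reclaimable_bytes": page_size * free_pages,
    }
```

If `free_pages` is close to zero, a VACUUM will probably not shrink the file by much, which would match the 5% result reported earlier.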

[Edited at 2015-04-10 02:21 GMT]


 
Meta Arkadia
Meta Arkadia
Local time: 02:08
English to Indonesian
Drop Index Apr 10, 2015

What causes the database to inflate is quite probably the index. Dropping the index would make sharing the database via e-mail a lot faster, though that time will probably be lost again when the recipient has to CREATE INDEX at the other end. The Automatic Index for SDLTM is enabled by default in the Pragmas.
See
http://sqlite.org/pragma.html#pragma_shrink_memory
and
https://www.sqlite.org/lang_dropindex.html

FWIW,

Hans
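One way to measure how much space the indexes actually take is to drop them on a copy and compare sizes. This is an experiment only, under loud assumptions: the function name `drop_indexes_on_copy` is hypothetical, nothing here is specific to the real SDLTM schema, and whether Trados Studio accepts or rebuilds a TM without its indexes is not established in this thread. The original CREATE INDEX statements are returned so they can be replayed on the copy:

```python
import os
import shutil
import sqlite3


def drop_indexes_on_copy(tm_path: str, out_path: str) -> tuple[int, int, list[str]]:
    """Drop explicit indexes on a copy; return (old_size, new_size, create_sql)."""
    shutil.copyfile(tm_path, out_path)
    con = sqlite3.connect(out_path)
    try:
        # Rows with sql IS NULL are automatic indexes backing PRIMARY KEY or
        # UNIQUE constraints; SQLite does not allow those to be dropped.
        rows = con.execute(
            "SELECT name, sql FROM sqlite_master "
            "WHERE type = 'index' AND sql IS NOT NULL"
        ).fetchall()
        for name, _sql in rows:
            quoted = name.replace('"', '""')
            con.execute(f'DROP INDEX "{quoted}"')
        con.commit()
        # Without VACUUM the freed index pages would stay inside the file.
        con.execute("VACUUM")
    finally:
        con.close()
    create_sql = [sql for _name, sql in rows]
    return os.path.getsize(tm_path), os.path.getsize(out_path), create_sql
```

Running each returned statement with `con.execute(sql)` on the copy would recreate the indexes, which is the re-indexing cost at the receiving end mentioned above.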


 
jmutka
jmutka
United States
Local time: 15:08
Finnish to English
TOPIC STARTER
dropping the index would probably help shrink the file, but Apr 11, 2015

Meta Arkadia wrote:
What causes the database to inflate, is quite probably the index.
Hans

i assume yes, but then if the recipient is going to re-index immediately upon receipt, it's perhaps not great for the workflow.

haven't (yet) tried how much space the index takes up, and whether its space is vacuumable.


 
jmutka
jmutka
United States
Local time: 15:08
Finnish to English
TOPIC STARTER
summary / resulting workflow / sdltm shrinking Apr 11, 2015

just to summarize the thread - and the wisdom it now contains:

- sdltm is an SQLite database file
- if its size is a concern, it can be shrunk by opening the file up in various SQLite apps, for example SQLite Expert Personal 3, and performing a "vacuum."
- the file will only shrink if it has contained a large number of TUs that have been, for one reason or another, later deleted.
- doing this has no adverse side-effects, but actual mileage may vary

(- it may shrink more if you also drop its index - but haven't tried this.)

thanks for all.
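Since "actual mileage may vary", a shrunk copy can be sanity-checked before it replaces the original. A sketch, assuming both files open as plain SQLite databases; the function names `integrity_ok` and `table_counts` are made up for illustration. It runs SQLite's own `PRAGMA integrity_check` and compares per-table row counts:

```python
import sqlite3


def integrity_ok(db_path: str) -> bool:
    """True if PRAGMA integrity_check reports the single row 'ok'."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute("PRAGMA integrity_check").fetchall()
    finally:
        con.close()
    return rows == [("ok",)]


def table_counts(db_path: str) -> dict[str, int]:
    """Row count for every user table in the database."""
    con = sqlite3.connect(db_path)
    try:
        # Skip SQLite's internal tables, whose names start with "sqlite_".
        names = [
            row[0]
            for row in con.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite\\_%' ESCAPE '\\'"
            )
        ]
        counts = {}
        for name in names:
            quoted = name.replace('"', '""')
            counts[name] = con.execute(f'SELECT COUNT(*) FROM "{quoted}"').fetchone()[0]
    finally:
        con.close()
    return counts
```

If `integrity_ok(copy)` is true and `table_counts(original) == table_counts(copy)`, the vacuumed copy holds the same number of rows as the original. This checks counts only, not the content of each TU.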


 
Riccardo Schiaffino
Riccardo Schiaffino  Identity Verified
United States
Local time: 13:08
Member (2003)
English to Italian
There is a free version of Xbench, also Apr 26, 2015

jmutka wrote:


... nonetheless. xbench seems to be $, so before i buy it you'll have to tell me that it's MUCH better than what i can do with studio 2011 filters.

cheers.


Version 2.9 of Xbench is still completely free, and the (newer and improved) version 3.0 also is free during a trial period (I think).

I've been using Xbench for years, and in my opinion it's absolutely worth the cost of the license.


 
Vladimir Pochinov
Vladimir Pochinov  Identity Verified
Russian Federation
Local time: 21:08
English to Russian
WeTransfer free service for sending up to 10GB files Apr 26, 2015

jmutka wrote:

i have an on-going literary translation project where the database is growing offensively large (too big to email)


If you are primarily interested in advice on how to send large files, you can use WeTransfer - https://www.wetransfer.com/

P.S. I stand corrected - the free version is limited to 2GB per file. WeTransfer Plus costs $10 per month and allows you to send 10GB files.

[Edited at 2015-04-26 10:29 GMT]


 

