Skip to content
This repository was archived by the owner on May 25, 2021. It is now read-only.

COUCHDB-769: Store attachments in the external storage. #82

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

gilv
Copy link

@gilv gilv commented Oct 14, 2015

Initial implementation that allows CouchDB to store attachments outside of the database file.
This implementation supports OpenStack Swift and SoftLayer Object store

…mplementation, supports OpenStack Swift and SoftLayer Object store
@@ -407,7 +407,15 @@ db_req(#httpd{method='POST',path_parts=[_,<<"_bulk_docs">>], user_ctx=Ctx}=Req,
true -> [all_or_nothing|Options];
_ -> Options
end,
case fabric:update_docs(Db, Docs, Options2) of
couch_log:debug("chttpd_db: update_docs wellcome. Going to store attachment in external store",[]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

welcome! (:

@kxepal
Copy link
Member

kxepal commented Oct 14, 2015

Please fix:

  • Style: try to fit 80 chars line length, put space after comma, no tabs, correctly aligned clauses etc.
  • Reduce amount of debug logs: debug logs should help you understand what going on in your code and in which state it is in places where problems may happens, not produce noise about every logical step.

@kxepal
Copy link
Member

kxepal commented Oct 14, 2015

Would be nice to see some tests for this to prove that this all works.

NewNewDoc = fabric_attachments_handler:inline_att_store(Db, NewDoc); %store_single_document
"false" ->
couch_log:debug("Store inline attachmets in Swift disabled",[]),
NewNewDoc = NewDoc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better not assign variable in case, but assign case result to variable.

@kxepal
Copy link
Member

kxepal commented Oct 14, 2015

Is it possible to decouple logic of working with external attachments from the logic that works with internal ones? Mixing them makes quite hard to follow the track.

@gilv
Copy link
Author

gilv commented Dec 24, 2015

@kxepal : you asked to "Is it possible to decouple logic of working with external attachments from the logic that works with internal ones? Mixing them makes quite hard to follow the track."

Can you please explain me what would you like to see exactly? Perhaps i didn't understand it.
Currently fabric_attachment_handler is the one that responsible to store attachments externally. This module does't really care if it's internal or external, they both go to the same object store.

@kxepal
Copy link
Member

kxepal commented Dec 24, 2015

@gilv I mean to get rid all case AttExternal of "external" -> ...; _ -> ... end and instead have some function(s) with two clauses: one to process attachments stored inside db file and other - in some external storage. This makes logic of processing internal and external attachments separated, clean and easy to track. This also helps to avoid situations like this one when depending on attachment type.

@gilv
Copy link
Author

gilv commented Jan 4, 2016

@kxepal @rnewson
I provided another patch, implementing most of the comments.
Here is what is not implemented in this patch:

  1. Remark on the AttFun. Not yet implemented. Original remark: “AttFun could be actually an attachment data or some fold function? Something went wrong if you have to do this.”
  2. usage of couch_util:to_hex
  3. still no unitests

@rnewson
Copy link
Member

rnewson commented Jan 4, 2016

Given the nature of the patch, this cannot merge without tests, assuming all style and other issues are resolved.

false ->
couch_log:debug("externalize attachment: disabled",[]),
NewDocs = Docs
end,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cleaner if the result of the case statement was assigned to NewDocs. i.e, NewDocs = case...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, and really, it should be fabric:update_docs itself that makes this decision, the http presentation layer should not know.

@rnewson
Copy link
Member

rnewson commented Jan 4, 2016

summary: there's too much logic in the http layer, fabric should be the place where the decisions are made.

@gilv
Copy link
Author

gilv commented Jan 7, 2016

@kxepal I need your advice on the following:
Consider the following use case:

  1. Attachment is stored externally
  2. User download attachment
  3. CouchDB fails to download attachment from Swift ( for example network failure )

The relevant code in "chttpd_db" is:

ExtStoreID = couch_att:fetch(att_extstore_id,TmpAtt),
case ExtStoreID of
undefined ->
couch_log:debug("chttpd_db: Att is not stored in the external store pn",[ExtStoreID]),
Att = TmpAtt;
_ ->
Att = fabric_att_handler:att_get(Db, TmpAtt),
couch_log:debug("chtttpd_db: Got attachment from the external store pn",[ExtStoreID])
end,

My question is: what is the best way to handle this?
Should i do something like this:

case fabric_att_handler:att_get(Db, TmpAtt),
{ok, Att} ->
couch_log:debug("chtttpd_db: Got attachment from the external store pn",[ExtStoreID])
{error,_} ->
throw({
bad_request,
"Please try again later"
})
.....

@kxepal
Copy link
Member

kxepal commented Jan 7, 2016

@gilv It's not a bad request since it's not a user fault that Swift is down. HTTP 503 or HTTP 502 looks good here. But yes, the only thing you can do is to fail and throw error back.

Btw, what's the timeout you use for fetching remote attachments? Do you stream response from Swift or buffer it in memory before return back to user?

@gilv
Copy link
Author

gilv commented Jan 20, 2016

@kxepal @rnewson
Few notes related recent patch

  1. Please ignore all debug / info print statements. At some point i will remove them at once.
  2. I addressed most of the comments.
  3. Remark on the AttFun. Not yet implemented. Original remark: “AttFun could be actually an attachment data or some fold function? Something went wrong if you have to do this.”
  4. Still no unitests / functional tests.
  5. Attachment retrieval from the object store - keep all data in memory. I need to figure out how to use streaming there.

@gilv
Copy link
Author

gilv commented Jan 29, 2016

@kxepal Can you please review the recent code?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants