Description
I’ve rebased the release-2.1 branch onto master; all of my investigation is based on that state. Replicate that step if you need to.
In duka/ore/processor.py, if the user passes --local-time on the command line, the program just replaces the ticks' timezone (the tzinfo attribute) with tz.tzlocal(). But we did it the wrong way: the .replace() function returns a new, modified object rather than changing the original in place, so the statement does nothing. Simply turning that line into an assignment still doesn't work, though; the problem is much more complex. Let me explain.
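For reference, here is a minimal standalone illustration of the pitfall (not duka code). Note that even with the assignment, .replace(tzinfo=...) only labels the datetime with a timezone; it does not convert the wall-clock time:

from datetime import datetime
from dateutil import tz

tick_time = datetime(2017, 9, 6, 12, 0)

tick_time.replace(tzinfo=tz.tzlocal())              # result discarded, tick_time is untouched
print(tick_time.tzinfo)                             # None

tick_time = tick_time.replace(tzinfo=tz.tzlocal())  # assignment keeps the new object...
print(tick_time)                                    # ...but the clock time is still 12:00, now merely labelled local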
The Dukascopy Historical Data Feed, to which duka is intended to be a command-line alternative, treats the local-time option differently from our current implementation. Our goal is to produce the same result as the official tool.
Here’s the scenario: I live in a UTC+8 country and I want to download XAU/USD candlestick data from 2017-09-06 to 2017-09-07 with a 4-hour timeframe.
If I choose the Local option on the official tool, I get something like this:
Local time Open High Low Close Volume
06.09.2017 00:00:00.000 1335.251 1344.328 1335.151 1340.958 18168971.97
06.09.2017 04:00:00.000 1340.898 1342.172 1338.062 1341.571 4586580.001
06.09.2017 08:00:00.000 1341.569 1342.429 1337.639 1337.761 8711390.002
06.09.2017 12:00:00.000 1337.762 1339.498 1336.299 1337.438 9259679.997
06.09.2017 16:00:00.000 1337.418 1341.828 1337.261 1339.292 13025489.99
06.09.2017 20:00:00.000 1339.292 1340.892 1335.888 1339.868 19017017.99
07.09.2017 00:00:00.000 1339.868 1340.238 1331.598 1334.172 14472906.98
07.09.2017 04:00:00.000 1334.171 1334.572 1332.501 1333.401 3560730.003
07.09.2017 08:00:00.000 1333.388 1335.878 1332.591 1334.808 8341879.984
07.09.2017 12:00:00.000 1334.828 1338.179 1333.978 1337.429 10472150
07.09.2017 16:00:00.000 1337.438 1340.361 1336.971 1338.631 11213249.98
07.09.2017 20:00:00.000 1338.632 1349.328 1338.632 1346.978 24777360.03
If I choose GMT:
Gmt time Open High Low Close Volume
06.09.2017 00:00:00.000 1341.569 1342.429 1337.639 1337.761 8711390.002
06.09.2017 04:00:00.000 1337.762 1339.498 1336.299 1337.438 9259679.997
06.09.2017 08:00:00.000 1337.418 1341.828 1337.261 1339.292 13025489.99
06.09.2017 12:00:00.000 1339.292 1340.892 1335.888 1339.868 19017017.99
06.09.2017 16:00:00.000 1339.868 1340.238 1331.598 1334.172 14472906.98
06.09.2017 20:00:00.000 1334.171 1334.572 1332.501 1333.401 3560730.003
07.09.2017 00:00:00.000 1333.388 1335.878 1332.591 1334.808 8341879.984
07.09.2017 04:00:00.000 1334.828 1338.179 1333.978 1337.429 10472150
07.09.2017 08:00:00.000 1337.438 1340.361 1336.971 1338.631 11213249.98
07.09.2017 12:00:00.000 1338.632 1349.328 1338.632 1346.978 24777360.03
07.09.2017 16:00:00.000 1346.981 1349.522 1342.378 1348.218 13315830.04
07.09.2017 20:00:00.000 1348.118 1349.839 1347.309 1348.838 3759689.994
The current program implementation outputs something like this:
06.09.2017 08:00:00.000 1341.569 1342.429 1337.639 1337.761 8711390.002
06.09.2017 12:00:00.000 1337.762 1339.498 1336.299 1337.438 9259679.997
06.09.2017 16:00:00.000 1337.418 1341.828 1337.261 1339.292 13025489.99
06.09.2017 20:00:00.000 1339.292 1340.892 1335.888 1339.868 19017017.99
07.09.2017 00:00:00.000 1339.868 1340.238 1331.598 1334.172 14472906.98
07.09.2017 04:00:00.000 1334.171 1334.572 1332.501 1333.401 3560730.003
07.09.2017 08:00:00.000 1333.388 1335.878 1332.591 1334.808 8341879.984
07.09.2017 12:00:00.000 1334.828 1338.179 1333.978 1337.429 10472150
07.09.2017 16:00:00.000 1337.438 1340.361 1336.971 1338.631 11213249.98
07.09.2017 20:00:00.000 1338.632 1349.328 1338.632 1346.978 24777360.03
07.10.2017 00:00:00.000 1346.981 1349.522 1342.378 1348.218 13315830.04
07.10.2017 04:00:00.000 1348.118 1349.839 1347.309 1348.838 3759689.994
It’s quite obvious that line 1 of the GMT output corresponds to line 3 of the Local output, line 2 to line 4, and so on. We can conclude that when a user asks for a local version of the data, what they actually want is data that begins at midnight of the start date and ends at midnight of the end date in their local timezone.
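To double-check that correspondence, here is a small sanity check (using a fixed UTC+8 offset for illustration; duka itself would use tz.tzlocal()):

from datetime import datetime
from dateutil import tz

local = tz.tzoffset('UTC+8', 8 * 3600)
gmt = tz.tzoffset('GMT', 0)

# Local midnight of the start date is 16:00 GMT of the previous day,
# i.e. two 4-hour candles before GMT midnight.
print(datetime(2017, 9, 6, 0, 0, tzinfo=local).astimezone(gmt))    # 2017-09-05 16:00:00+00:00

# GMT midnight of the start date is 08:00 local, the third 4-hour candle
# of the local day, which is exactly the shift visible in the tables above.
print(datetime(2017, 9, 6, 0, 0, tzinfo=gmt).astimezone(local))    # 2017-09-06 08:00:00+08:00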
Then I started changing some related code. For example, in duka/ore/processor.py, instead of simply fetching range(0, 24) hours, I switched to a more precise day-and-hour approach: first convert the user-supplied start and end dates to datetime instances carrying the local timezone as tzinfo, then convert them to GMT start and end datetimes.
In fetch.py:
if local_time:
    # Start the 24-hour window at local midnight instead of GMT midnight:
    # combine the date with 00:00, attach the local timezone, then convert to GMT.
    day = datetime.combine(day, datetime.min.time())
    day = day.replace(tzinfo=tz.tzlocal())
    day = day.astimezone(tz.tzoffset(None, 0))
tasks = []
for i in range(0, 24):
    delta_day = day + timedelta(hours=i)
    url_info = {
        'currency': symbol,
        'year': delta_day.year,
        'month': delta_day.month - 1,  # Dukascopy URLs use zero-based months
        'day': delta_day.day,
        'hour': delta_day.hour
    }
    tasks.append(asyncio.ensure_future(get(URL.format(**url_info))))
I tried again after these modifications, but it still doesn't work as expected, and I've found something interesting. Since we don't preserve the return state of each fetch result (we just append its serialized result to the BufferIO buffer), we need the add_hour function to keep the hour part of the final result correct. And the current implementation assumes the market opens at the beginning of the GMT day (hour_delta = 0 on line 42 of duka/ore/processor.py) and always closes at the end of the GMT day.
This function confuses me quite a bit, for several reasons:
- Why can we just assume hour_delta begins at zero? Is there any possibility that some market doesn't begin at GMT midnight?
- The add_hours function treats ticks[0] differently when ticks[0] falls on a Saturday or on the first day of a year. Why?
- Since we don't preserve the return state of each fetch result, we don't know whether a given result came back empty or with data: everything is combined and reduced together (see the sketch below). In the scenario above, the XAU/USD market closed at 5 a.m. and re-opened at 6 a.m. in my locale, but the current implementation outputs the market closing at 11 p.m. and re-opening at midnight. When it encounters a Monday, it outputs the market closing at 6 p.m. This question is really the same as the first one.
To resolve this issue, I think the program will need quite a large refactor, and I'd like to hear the opinions of the original author and the community.