Caching HTTPS GitHub credentials

When you clone a repository over HTTPS, you have probably seen the following prompt at least once.

$ git clone https://github.com/foo/bar.git
Cloning into 'bar'...
Username for 'https://github.com': xxx

This is easy to resolve: enter your username and a token issued on the personal access tokens page in the GitHub settings.

But what if you have no chance to enter the token yourself, for example when Homebrew clones a repository over HTTPS on your behalf?

We can cache the GitHub credentials in Git so that they are used even when we have no control over how an external script runs.

  1. Make sure GitHub CLI is installed.
  2. Run gh auth login and select HTTPS as your preferred protocol.

Now you should be able to clone repositories over HTTPS without entering the username and personal access token explicitly. Git takes care of that automatically.

See the official document for more detail.

Specify a project-specific Python with Poetry

Poetry lets us manage Python packages neatly in a sandboxed environment. My experience with Poetry has been excellent, and I have switched my Python workflow from Pipenv to Poetry.

But I sometimes found that Poetry failed to resolve the dependencies because the environment used an unsupported Python version. For example, it shows the following error message.

The currently activated Python version 3.7.14 is not supported by the project (^3.9).
Trying to find and use a compatible version.

In this case, some packages I used were not compatible with Python 3.7. But I was confused because my local PATH leads to Python 3.9 thanks to pyenv. So how can I specify the project-specific Python version?

The answer was simple. It is not enough to set the PATH with pyenv. We also need to set the Python version with the init or env use subcommands.

$ poetry init --python 3.9

OR

$ poetry env use 3.9

This tells Poetry explicitly which Python version the project environment should use. With these settings, we can install all dependencies without any problem.
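To double-check which interpreter the environment ended up with, a minimal sketch like the following could be saved to a file and run with poetry run python. It assumes the project constraint is ^3.9 as in the error message above, which Poetry reads as >=3.9,<4.0. The helper name satisfies_caret_3_9 is made up for this illustration and is not part of Poetry.

```python
import sys


def satisfies_caret_3_9(version) -> bool:
    """Return True if (major, minor, ...) falls within ^3.9, i.e. >=3.9,<4.0."""
    # Hypothetical helper: compares only the major and minor components.
    return (3, 9) <= tuple(version[:2]) < (4, 0)


# Report the interpreter that is actually running this script.
print(sys.version.split()[0], satisfies_caret_3_9(sys.version_info))
```

If the second value printed is False, the environment is still bound to an interpreter outside the supported range.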

Sign your Git commits with an SSH key

I used to sign my Git commits with a GPG key, because it shows the identity and authenticity of my commits publicly. A signed commit that is associated with an email address registered on GitHub gets the verified mark in the UI.

But you can also sign commits with an SSH key. This is an easier way because most GitHub users have already registered an SSH key to push code to GitHub, so there is no need to prepare another key only for signing.

First, tell Git to sign commits by default and to use SSH as the signature format.

$ git config --global commit.gpgsign true
$ git config --global gpg.format ssh

Second, set the location of the public key you are using.

$ git config --global user.signingkey /PATH/TO/.SSH/KEY.PUB

Finally, make sure to register this key as a Signing key in the GitHub settings. I thought registering the key under Authentication keys would be enough, but it did not work. Check the Signing keys section in your GitHub account’s SSH and GPG keys settings.

That’s it. You will see your commits marked as verified the next time you push to GitHub.

See the official document for more detail.

To learn more about the general mechanism and usage of Git, this book will be helpful.

Reinterpret a Date in another time zone in Rails

A Date in Rails does not carry time zone information. When it is converted to a time, the default time zone follows the system configuration of the Rails application. That is implicit, so we must always keep in mind which time zone a Date is meant to be in.

We can use in_time_zone to be explicit about which time zone we mean.

pry(main)> d = Date.new(2023, 1, 20)
=> Fri, 20 Jan 2023
pry(main)> d.to_time
=> 2023-01-20 00:00:00 +0000
pry(main)> d.in_time_zone('Asia/Tokyo')
=> Fri, 20 Jan 2023 00:00:00.000000000 JST +09:00
pry(main)> d.in_time_zone('Canada/Eastern')
=> Fri, 20 Jan 2023 00:00:00.000000000 EST -05:00

in_time_zone returns an ActiveSupport::TimeWithZone, which retains the time zone information. It is handy that we can pass the time zone identifier as a string without changing the system configuration.

The book Agile Web Development with Rails 6 is a more comprehensive guide to understanding how Rails handles time zones.

Enable the schedule without backfill in Digdag

What differentiates a general workflow framework from cron is how it manages the idempotency and consistency of executions. Cron has no mechanism to ensure the identity of an execution, so cron users should not expect higher-level semantics such as exactly-once. Workflow frameworks like Digdag provide more granular support in that sense to meet the requirements imposed by our applications.

But this support occasionally leads to behavior that is hard to understand, as it did for me. I just wanted to restart the pending schedules without backfilling the past executions. The Digdag REST API supports /api/schedules/{id}/enable to enable the disabled schedules of a workflow, but it was more complex than I thought. Here is why, along with the caveat to keep in mind.

Schedule Times

A workflow in Digdag has schedule entities that look like the following.

    {
      "id": "1",
      "project": {
        "id": "2",
        "name": "my_workflow_project"
      },
      "workflow": {
        "id": "2",
        "name": "my_workflow"
      },
      "nextRunTime": "2023-01-12T01:38:00Z",
      "nextScheduleTime": "2023-01-12T01:00:00+00:00",
      "disabledAt": null
    }

Note that nextRunTime and nextScheduleTime are different. The nextScheduleTime says, “We are covering the incoming data by this time”. The nextRunTime, on the other hand, is the actual time when the execution happens. So nextRunTime should always come after nextScheduleTime.

To skip the backfill, we need to be careful about this relationship.
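To make the relationship concrete, here is a small Python sketch that parses the two timestamps from the example entity and prints the gap between them. It only uses the standard library. The helper parse_time is a name introduced for this illustration, not something Digdag provides.

```python
import json
from datetime import datetime


def parse_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; "Z" is rewritten for older Python versions."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# The schedule entity from the example above, trimmed to the time fields.
schedule = json.loads("""
{
  "id": "1",
  "nextRunTime": "2023-01-12T01:38:00Z",
  "nextScheduleTime": "2023-01-12T01:00:00+00:00",
  "disabledAt": null
}
""")

next_run = parse_time(schedule["nextRunTime"])
next_schedule = parse_time(schedule["nextScheduleTime"])

# The run happens some time after the period it covers.
delay = next_run - next_schedule
print(delay)  # 0:38:00
```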

Specifying the next time without backfill

The framework needs to know the next execution time if you want to skip the pending sessions when enabling the schedule. Therefore, the API POST /api/schedules/{id}/enable accepts the following parameters to skip these sessions.

{
  "skipSchedule": true,
  "nextTime": "2023-01-12T01:39:00+00:00"
}

The nextTime is the time after which we expect the next schedule to run. Make sure to set a nextTime later than the nextRunTime of the last schedule, not its nextScheduleTime. The API is well designed but takes some brain power to use.

The challenge of specifying the current time

If we want to skip the pending sessions until now, what should we do? For instance, suppose the last schedule, which has a pending session, looks as follows.

    {
      "id": "1",
      "nextRunTime": "2023-01-12T01:38:00Z",
      "nextScheduleTime": "2023-01-12T01:00:00+00:00",
      "disabledAt": "2023-01-12T01:00:00Z"
    }

Before “2023-01-12T01:38:00Z”, the following request fails because there is no session to skip before the nextRunTime.

POST /api/schedules/1/enable
{
  "skipSchedule": true,
  "nextTime":  <The current time>
}

The response is:

{
  "message":"Specified time to skip schedules is already past",
  "status":409
}

After “2023-01-12T01:38:00Z”, it succeeds in skipping the last schedule specified by “nextRunTime” = “2023-01-12T01:38:00Z”. So the lesson is that we need to be aware of the previous nextRunTime if we want to skip the pending sessions until the current time.
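The rule can be sketched as a client-side check in Python before calling the API. This is only an illustration under the behavior described in this post, not Digdag’s actual validation logic. The function name skip_until_now_body is hypothetical, and treating a nextTime equal to nextRunTime as too early is my assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; "Z" is rewritten for older Python versions."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def skip_until_now_body(next_run_time: str, now: datetime) -> Optional[dict]:
    """Build the enable request body, or return None if `now` is too early."""
    if now <= parse_time(next_run_time):
        # Assumption: the API would answer 409 here, as in the example above.
        return None
    return {
        "skipSchedule": True,
        "nextTime": now.isoformat(timespec="seconds"),
    }


early = datetime(2023, 1, 12, 1, 30, tzinfo=timezone.utc)
late = datetime(2023, 1, 12, 1, 39, tzinfo=timezone.utc)

print(skip_until_now_body("2023-01-12T01:38:00Z", early))  # None
print(skip_until_now_body("2023-01-12T01:38:00Z", late))
```

When the function returns None, the options are to wait until the previous nextRunTime has passed or to pick a nextTime later than it instead of the current time.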