True Cause behind Additional Verification in ACM

AWS Certificate Manager (ACM) is a service that manages the complexity around SSL/TLS certificates for us: creating, storing, and renewing them. ACM handles almost all of the operational burden on our behalf so that we can concentrate on essential application development. That is a massive benefit if you want to provide a secure web service over SSL/TLS. (Of course, every website should use SSL/TLS by default.)

The other day, I encountered a situation where ACM showed an error message like this:

Request failed The status of this certificate request is “Failed”. Additional verification is required to request certificates for one or more domain names in this request.
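The same status and failure reason are visible from the AWS CLI. Here is a minimal sketch, with a placeholder certificate ARN:

# Status should read FAILED, and FailureReason should read
# ADDITIONAL_VERIFICATION_REQUIRED in this situation.
$ aws acm describe-certificate \
    --certificate-arn arn:aws:acm:us-east-1:123456789012:certificate/example \
    --query 'Certificate.[Status,FailureReason]'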

The certificate request failed. What does that mean? I usually pass the validation process without any trouble, so what do I need to do to handle this additional verification?

The AWS forum gave me a clear answer.

Usually, this error appears when an ACM certificate request contains a domain listed under the Alexa Top 1000 domains. This process is in place to prevent abuse.

Indeed, the domain I requested the certificate for was listed in the Alexa Top 1000 at the time. :) Therefore, you will rarely run into this situation unless you own an extremely popular domain.

The only way to resolve the issue is to file a support ticket asking AWS to add the domain to the allow list; that appears to serve as the additional verification in this case. The AWS support team will respond promptly, and the certificate will be issued once the ticket is closed.

Generate JSON response in Stoplight

Well-written documentation is an indispensable resource for users learning a web API. It should be comprehensive yet concise, covering every endpoint along with its supported parameters and response formats. OpenAPI lets us write such documentation in a standard format so that various tools can enhance documentation generation.

But it’s still not fun. OpenAPI requires us to write response schemas by hand, which is time-consuming and error-prone: I have to craft the correct YAML entries manually while looking at the actual response.

Today, I found that Stoplight provides auto-generation for response models. Stoplight automatically generates a valid YAML or JSON schema corresponding to a given example response. For example, suppose our endpoint returns the following JSON response with a 200 status code.

{
    "request_id": 1234,
    "users": [
        {"name": "Alice", "age": 12},
        {"name": "Bob", "age": 14}
    ]
}

It contains a unique ID for the request and the user entries matching the request query. We can then generate the model schema just by clicking the Generate button, as shown below.

(Figure: Auto Generate)

The generated model schema looks like this.

(Figure: Model)

The example value and type of each entry are also properly configured.
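For reference, the generated schema is roughly equivalent to the following hand-written OpenAPI YAML (a sketch of what the generator produces; Stoplight's actual output may differ slightly in layout):

type: object
properties:
  request_id:
    type: integer
    example: 1234
  users:
    type: array
    items:
      type: object
      properties:
        name:
          type: string
          example: Alice
        age:
          type: integer
          example: 12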

It’s by far quicker and more manageable than filling in each item one by one. I should have known about this helpful functionality earlier. :)

Order Sensitive Heterogeneous Partitioning in Glow

Glow supports heterogeneous partitioning, which allows us to split the input model into multiple segments according to a given device configuration.

(Figure: Partition)

A Glow backend sometimes lacks support for certain operators due to functional limitations. Conversely, some backends execute specific operators more efficiently than others. Graph partitioning gives us a chance to improve performance and reliability by making the most of the available compute resources.

To achieve heterogeneous partitioning, we need to write a device configuration like the following.

---
- name:     Device1
  backendName: CPU
  parameters: |
    "deviceID" : "0"
- name:     Device2
  backendName: OpenCL
  parameters: |
    "nonSupportedNodes": "ResizeBilinear"
    "deviceID": "1"

This file makes Glow aware of the two platforms to run the partitions on: the CPU of the host machine and a GPU device exposing the OpenCL API. Since the OpenCL backend does not support ResizeBilinear, we mark it as nonSupportedNodes so that Glow automatically places that operator on the CPU instead.

But when I tried to partition the MobileNet v2 model in ONNX format, I got the following message.

$ ./bin/image-classifier \
  -model=../../mobilenetv2-7.onnx \
  -load-device-configs=../../heterogeneousConfig-bad.yaml \
  -log-partition=true \
  ../../glow/tests/images/imagenet/cat_285.png \
  -model-input-name=input \
  -onnx-define-symbol=batch_size,1
...
I0629 06:16:44.610828 23057 Partitioner.cpp:88] The number of partitions is : 1
I0629 06:16:44.610846 23057 PartitionerUtils.cpp:549]    Partition 0:
     Name : ../../mobilenetv2-7.onnx_part1_part1
     BackendKind :  CPU
     context count :  1
     total Memory : 14557376
       input size:  602112
       input count :  1
       input only from peers count :  0
       output size: 4000
       constant size: 13951264

No partitioning seems to happen, and all operators are assigned to the CPU backend. That is a bizarre situation.

After digging into the code base for a while, I found the cause in Glow's partitioner.

Expected<DAGListTy> Partitioner::backendBasedPartition(
    FunctionToBackendNameMap &funcToBackend, Function *F,
    std::vector<Backend *> &backends, CompilationContext &cctx) {
  NodeToFunctionMap mapping;
  llvm::DenseMap<Node *, std::string> nodeToBackendName;

  // For each node find a backend that supports it.
  for (auto &N : F->getNodes()) {
    for (auto &backend : backends) {
      // Find the first backend that supports this node. The order of backends
      // is important. The check flow is :
      // ...
    }
    // ...
  }

  // ... (build the partitioned DAG from the mapping and return it)
}

The algorithm is first-come, first-served: it always prefers the backend that comes first in the list as long as it supports the operator. As it happens, the CPU backend generally supports more operators than the OpenCL backend, so a configuration file that lists CPU first always yields a single partition backed by the CPU. To overcome this, we can reorder the backends in the configuration file.
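To see why the order matters, here is a minimal standalone sketch of the same first-come-first-served policy (hypothetical types and support rules for illustration; this is not Glow's actual code):

#include <iostream>
#include <string>
#include <vector>

// A hypothetical stand-in for a Glow backend, reduced to a name and
// an operator-support predicate.
struct Backend {
  std::string name;
  bool supports(const std::string &op) const {
    // Pretend CPU supports everything while OpenCL lacks ResizeBilinear.
    return name != "OpenCL" || op != "ResizeBilinear";
  }
};

// First-come, first-served: the first backend in the list that supports
// the operator wins, so the list order decides the assignment.
std::string pickBackend(const std::vector<Backend> &backends,
                        const std::string &op) {
  for (const auto &b : backends) {
    if (b.supports(op)) {
      return b.name;
    }
  }
  return "<unsupported>";
}

int main() {
  std::vector<Backend> cpuFirst = {{"CPU"}, {"OpenCL"}};
  std::vector<Backend> oclFirst = {{"OpenCL"}, {"CPU"}};
  std::cout << pickBackend(cpuFirst, "Conv") << "\n";           // CPU
  std::cout << pickBackend(oclFirst, "Conv") << "\n";           // OpenCL
  std::cout << pickBackend(oclFirst, "ResizeBilinear") << "\n"; // CPU
}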

---
- name:     Device2
  backendName: OpenCL
  parameters: |
    "nonSupportedNodes": "ResizeBilinear"
    "deviceID": "1"
- name:     Device1
  backendName: CPU
  parameters: |
    "deviceID" : "0"

I’ve just switched the order of the CPU and OpenCL devices. Running the same command again gives the following.

I0629 06:31:54.274185 23240 Partitioner.cpp:88] The number of partitions is : 1
I0629 06:31:54.274202 23240 PartitionerUtils.cpp:549]    Partition 0:
     Name : ../../mobilenetv2-7.onnx_part1_part1
     BackendKind :  OpenCL
     context count :  1
     total Memory : 14557376
       input size:  602112
       input count :  1
       input only from peers count :  0
       output size: 4000
       constant size: 13951264
I0629 06:31:54.274243 23240 PartitionerUtils.cpp:5

Now every operator is assigned to the OpenCL backend. With a model that contains a resize operator, such as FCN, we get the following partitioning layout.

I0629 06:34:34.624155 23388 Partitioner.cpp:88] The number of partitions is : 3
I0629 06:34:34.624177 23388 PartitionerUtils.cpp:549]    Partition 0:
     Name : ../../fcn.onnx_part1_part1
     BackendKind :  OpenCL
     context count :  1
     total Memory : 217777448
       input size:  602112
       input count :  1
       input only from peers count :  0
       output size: 131712
       constant size: 217043624
I0629 06:34:34.624246 23388 PartitionerUtils.cpp:570]      LogicalDeviceIDs : 1
I0629 06:34:34.624260 23388 PartitionerUtils.cpp:549]    Partition 1:
     Name : ../../fcn.onnx_part2_part1
     BackendKind :  CPU
     context count :  1
     total Memory : 8561280
       input size:  131712
       input count :  2
       input only from peers count :  0
       output size: 8429568
       constant size: 0
I0629 06:34:34.624302 23388 PartitionerUtils.cpp:570]      LogicalDeviceIDs : 0
I0629 06:34:34.624315 23388 PartitionerUtils.cpp:549]    Partition 2:
     Name : ../../fcn.onnx_part3_part1
     BackendKind :  OpenCL
     context count :  1
     total Memory : 16859136
       input size:  8429568
       input count :  2
       input only from peers count :  0
       output size: 8429568
       constant size: 0

One thing to note in this article is the order sensitivity of the device configuration in Glow. We should put the weaker device first and a strong backend like Interpreter or CPU later to increase the chance of a balanced distribution of partitions.

New C Frontend for MLIR Affine

MLIR is a powerful tool for representing complicated semantics while retaining the high-level structure of the code. The most notable thing about MLIR is its support for polyhedral compilation through the Affine dialect. The ability to express the static control parts (SCoP) of a program allows us to apply tailored optimizations that take the semantics of nested loops into account.

But the current MLIR toolchain does not necessarily make it easy to compile existing C/C++ programs. Additionally, we cannot make use of existing polyhedral compilers such as Polly and Pluto, as they are not aware of MLIR's representations.

Today, I found a paper about a tool that works as a C/C++ frontend, enabling polyhedral compilation on MLIR.

Polygeist: Affine C in MLIR

Polygeist makes two things possible. The first is converting given C code into MLIR expressed in the Standard/SCF/Affine dialects, so we can utilize the MLIR infrastructure for code transformation. The second is integration with existing polyhedral compilation frameworks: Polygeist converts the C code into a format those tools can recognize, namely the standard form they expect, such as the instantiation of each statement with its induction variables (e.g., S(i1, i2)).
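As a rough illustration (my own sketch, not an example taken from the paper), a static loop nest in C like this:

/* A simple static control part: a fixed loop nest over an array. */
void scale(float A[100], float s) {
  for (int i = 0; i < 100; i++)
    A[i] = A[i] * s;
}

would be raised into Affine-dialect MLIR of roughly this shape (the exact operation names depend on the MLIR version):

func @scale(%A: memref<100xf32>, %s: f32) {
  affine.for %i = 0 to 100 {
    %0 = affine.load %A[%i] : memref<100xf32>
    %1 = mulf %0, %s : f32
    affine.store %1, %A[%i] : memref<100xf32>
  }
  return
}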

(Figure: Polygeist Flow)

Although Polygeist does not treat performance as a first-class goal, the code generated by the framework shows competitive performance.

I found code on GitHub, but I am not sure the repository is truly for Polygeist because it does not contain any reference to the paper.

https://github.com/wsmoses/Polygeist

Control Log Level of Terraform

Terraform Cloud writes its stderr log to the console so that we can quickly discover which resources are created, changed, and destroyed. However, the output may sometimes be too lengthy to grasp the essential information quickly. As with a typical long-running application, it is necessary to control the log level to suppress inessential output.

The TF_LOG environment variable is the way to control what type of information is written to stderr.
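For example, when running Terraform locally (TF_LOG accepts TRACE, DEBUG, INFO, WARN, and ERROR):

# Suppress everything below warnings for a quick overview.
$ export TF_LOG=WARN
$ terraform plan

# Or turn on the most verbose level for a single run while debugging.
$ TF_LOG=TRACE terraform apply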

(Figure: Variables)

This variable is, of course, usable with local Terraform. Additionally, we can make the environment variable effective in Terraform Cloud by setting it in the Variables pane of the dashboard.

It simplifies the process of digging through the logs Terraform writes for you.
